(57) Abstract: A method of producing a replication defective retrovirus comprising transfecting a producer cell with the following:
i) a retroviral genome; ii) a nucleotide sequence coding for retroviral gag and pol proteins; and iii) nucleotide sequences
encoding other essential viral packaging components not encoded by the nucleotide sequence of (ii); characterised in that the nucleotide
sequence coding for retroviral gag and pol proteins is codon optimised for expression in the producer cell.
Method

Field of the Invention 

The present invention relates to methods of improving the safety of retroviral 
vectors capable of delivering therapeutic genes for use in gene therapy, and to
novel nucleotide sequences for use in such methods. 

Background to the Invention 

Retroviral vectors are now widely used as vehicles to deliver genes into cells.
Their popularity stems from the fact that they are easy to produce and mediate 
stable integration of the gene that they carry into the genome of the target cell. 
This enables long-term expression of the delivered gene (1). 

There has been considerable interest, for some time, in the development of retroviral
vector systems based on lentiviruses. Lentiviruses are a small subgroup of complex
retroviruses. They contain, in addition to the common retroviral genes (gag, pol and
env), genes which enable them to regulate their life cycle and to infect non-dividing
cells (2). Vector systems based thereon are therefore of interest because of their
potential use in the transfer of a gene of interest to non-dividing cells such as
neurones. In addition, lentiviral vectors enable very stable long-term expression of 
the gene of interest. This has been shown to be at least three months for 
transduced rat neuronal cells while MLV based vectors were only able to express 
the gene of interest for six weeks. 

The most commonly used lentivirus is the Human Immunodeficiency Virus (HIV), 
the etiologic agent of AIDS (acquired immune deficiency syndrome). HIV-based 
vectors have been shown to efficiently transduce non-dividing cells (3) and can be
used, for example, to target anti-HIV therapeutic genes to HIV susceptible cells. 

However, HIV vectors have a number of significant disadvantages that may limit
their therapeutic application to certain diseases. In particular, HIV-1 is a human
pathogen carrying potentially oncogenic proteins and sequences. There is the risk
that introduction of vector particles produced in packaging cells which express
HIV gag-pol will introduce these proteins into the patient leading to
seroconversion.

Emphasis has therefore been placed on the safety of these vectors. One strategy 
looks at the design of production systems for retroviral vectors. A retrovirus
vector system basically consists of two elements, a packaging cell line and a
vector genome. The simplest packaging line consists of a provirus in which the ψ
sequence (a determinant of RNA packaging reported in HIV as lying between
U5 and gag) has been deleted. When stably transfected into a cell, virus particles
containing reverse transcriptase will be produced but virion RNA will not
become packaged within these particles. The complementing component in a
retrovirus vector system is the genome vector itself. The genome vector needs to
contain a packaging sequence but much of the structural coding regions can be
deleted. Often a selectable marker gene, or other nucleotide sequence of interest,
is incorporated into the vector. Vector stocks of the packaging line can then be
used to infect target cells. Provided the cell is successfully infected by the viral
particle, the genome vector sequence will be reverse transcribed and integrated by
the retroviral machinery. However, infection is an end process so no further
replication or spread of the vector should occur.

As indicated above, however, problems are encountered in the design of safe and
effective retroviral vectors. These include the possibility that recombination 
between the packaging vector and the packaging sequence can lead to the 
generation of wild type replication competent virus. Consequently efforts have 
been directed at improving the safety of packaging cell constructs. 

In second generation packaging cell lines, in addition to deletion of the packaging
sequence, the 3' LTR was also deleted so that two recombinations are necessary 
to generate a wild type virus. 

In third generation packaging lines the gag-pol genes and env gene are placed on
separate constructs that are sequentially introduced into the packaging cells to 
prevent recombination during transfection. 

With regard to the packaging signal, EP 0 368 882A (Sodroski) discloses that in 
HIV it corresponds to the region between the 5' major splice donor and the gag
initiation codon, and particularly corresponds to a segment just downstream of
the 5' major splice donor, and about 14 bases upstream of the gag initiation
codon. It is this region which Sodroski teaches should be deleted from the gag-pol
cassette. WO97/12622 (Verma) describes that in HIV-1 a 39 bp internal
deletion in the ψ sequence can be made between the 5' splice donor site and the
starting codon of the gag gene.

Codon wobbling can be used to reduce recombination frequency while 
maintaining the primary protein sequence of the constructs, cf. (4), in which the
region of overlap between the gag-pol and env expression constructs was reduced
to 61 bp extending over the common region between pol and env which are in
different reading frames. Transversion mutations were introduced into the final
20 codons of pol, retaining the integrity of the coding region while reducing the
homology with env to 55% in the overlap region. Similarly wobble mutations
were introduced into the 3' end of env and all sequences downstream of the env stop
codon were deleted.
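
As a rough illustration of the wobble approach described above, the following Python sketch introduces synonymous third-base transversions into a coding sequence and measures the resulting nucleotide identity. It is a simplification under stated assumptions: it compares a sequence with its own wobbled copy rather than comparing pol with env in their different reading frames, and the function names and the example sequence are hypothetical and are not taken from reference (4).

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def translate(seq):
    """Translate an in-frame DNA sequence (length a multiple of 3)."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def wobble_transversions(seq):
    """Replace the third base of each codon with a synonymous transversion
    (purine <-> pyrimidine) where one exists; otherwise keep the codon."""
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        choice = codon
        for base in "ACGT":
            candidate = codon[:2] + base
            is_transversion = (base in PURINES) != (codon[2] in PURINES)
            if is_transversion and CODON_TABLE[candidate] == CODON_TABLE[codon]:
                choice = candidate
                break
        out.append(choice)
    return "".join(out)


def percent_identity(a, b):
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


original = "GGTGCTATGCTT"          # hypothetical 4-codon example
wobbled = wobble_transversions(original)
identity = percent_identity(original, wobbled)
```

In this toy case the encoded peptide is unchanged while one base in three of the mutable codons differs; codons with no synonymous transversion (such as ATG) are left as they are.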

Efficient vectors usually contain part of gag on the genome vector to increase 
virion titre. Unlike the packaging sequence which can be in any position within a
sequence to effect packaging, the gag sequence must be in its native position
adjacent to ψ to have any effect.

It will be appreciated that whilst significant improvements in packaging cell and 
vector design have been made there is still scope for further refinement of current
packaging lines. 

Summary of the Invention 

It is therefore an aim of the invention to provide retroviral particles, in particular
lentiviral particles, and particularly those which carry nucleotide constructs
encoding therapeutic proteins, that have improved safety over the corresponding
wild type viral particle. In our WO99/41397 we describe codon optimisation of
the gag-pol genes as a means of overcoming the Rev/RRE requirement for export
and to enhance RNA stability. We have now found however that the codon
optimised gag-pol sequence overcomes potential recombination problems with 
vector genomes which carry part of a gag sequence with the aim of increasing 
titre. This strategy also avoids the need to use gag regions from different viruses 
in the packaging and vector genome constructs. 

Another significant advantage provided by the invention is that the codon
optimisation disrupts RNA secondary structures, such as the packaging signal,
thus rendering the gag-pol mRNA non-packageable. Thus, the present invention
allows retroviral sequence upstream of the gag initiation codon to be retained, in 
contrast to Sodroski and Verma, without significantly compromising safety. 

Statements of the Invention 

Accordingly in one aspect the present invention provides use of a nucleotide 
sequence coding for retroviral gag and pol proteins, capable of assembly of a
retroviral vector genome into a retroviral particle in a producer cell, to generate a
replication defective retrovirus in a target cell, wherein the nucleotide sequence is
codon optimised for expression in the producer cell. 

Thus in one embodiment the present invention provides use of a nucleotide 
sequence coding for retroviral gag and pol proteins capable of assembly of a 
retroviral vector genome into a retroviral particle in a producer cell to reduce or
prevent packaging of the retroviral vector genome in a target cell, wherein the 
nucleotide sequence is codon optimised for expression in the producer cell. 

In another embodiment the present invention provides use of a nucleotide 
sequence coding for retroviral gag and pol proteins, capable of assembly of a
retroviral vector genome comprising at least part of a gag nucleotide sequence 
into a retroviral particle in a producer cell, to reduce or prevent recombination 
between said nucleotide sequence coding for retroviral gag and pol proteins and 
the at least part of a gag nucleotide sequence, wherein the nucleotide sequence 
coding for retroviral gag and pol proteins is codon optimised for expression in the
producer cell. 

Put another way the present invention provides a method of producing a 
replication defective retrovirus comprising transfecting a producer cell with the 
following:

i) a retroviral genome; 

ii) a nucleotide sequence coding for retroviral gag and pol proteins; 
and 

iii) nucleotide sequences encoding other essential viral packaging 
components not encoded by the nucleotide sequence of (ii); 

characterised in that the nucleotide sequence coding for retroviral gag and pol 
proteins is codon optimised for expression in the producer cell. 

Thus in one embodiment the present invention provides a method of reducing or 
preventing packaging of a retroviral genome in a target cell comprising the steps 
of:

a. transfecting a producer cell with the following to produce
retroviral particles: 

i) a retroviral genome; 

ii) a nucleotide sequence coding for retroviral gag and pol proteins; 

and 

iii) nucleotide sequences encoding other essential viral packaging 
components not encoded by one or more of the nucleotide 
sequences of (ii); and 

b. transfecting a target cell with retroviral particles of step (a); 
characterised in that the nucleotide sequence coding for retroviral gag and pol 
proteins is codon optimised for expression in the producer cell. 

In another embodiment the present invention provides a method to reduce or 
prevent recombination between a retroviral vector genome and a nucleotide 
sequence encoding a viral polypeptide required for the assembly of the viral 
genome into retroviral particles comprising transfecting a producer cell with the 
following: 

(i) a retroviral genome comprising at least part of a gag nucleotide 
sequence; 

(ii) a nucleotide sequence coding for retroviral gag and pol proteins; 

and 

(iii) nucleotide sequences encoding other essential viral packaging 
components not encoded by the nucleotide sequence of (ii); 

characterised in that the nucleotide sequence coding for retroviral gag and pol 
proteins is codon optimised for expression in the producer cell. 

We also provide novel codon optimised sequences as shown in SEQ ID NOS: 15 
and 16 and which may be used in the present invention. However, it will be 
appreciated that any convenient codon optimised gag-pol sequence may be 
employed in the invention.

The present invention further provides a retroviral particle produced using the
sequences of the present invention, and production methods for so doing.

The present invention also provides a pharmaceutical composition comprising a
viral particle according to the present invention, together with a pharmaceutically 
acceptable diluent or carrier. 

By "reducing" we mean that the chance of an event occurring is reduced 
compared to a comparable population having the wild-type gag-pol sequence.
Within a population the chance of an event occurring may be prevented for an 
individual retrovirus vector or particle. 

Detailed Description of the Invention 

Various preferred features and embodiments of the present invention will now be 
described by way of non-limiting example. 

The present invention employs the concept of codon optimisation. 

Codon optimisation has previously been described in our WO99/41397 as a
means of overcoming the Rev/RRE requirement for export and to enhance RNA
stability. The alterations to the coding sequences for the viral components
improve the sequences for codon usage in the mammalian cells or other cells
which are to act as the producer cells for retroviral vector particle production.
This improvement in codon usage is referred to as "codon optimisation". Many
viruses, including HIV and other lentiviruses, use a large number of rare codons
and by changing these to correspond to commonly used mammalian codons,
increased expression of the packaging components in mammalian producer cells
can be achieved. Codon usage tables are known in the art for mammalian cells,
as well as for a variety of other organisms.
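
By way of a minimal sketch only, the replacement of codons by commonly used synonymous codons can be expressed in Python as follows. The PREFERRED mapping is an illustrative assumption and is not the codon usage table of Figure 3b; the function names and the short example sequence are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Illustrative "preferred" codon per amino acid (an assumption for this
# sketch; a real optimisation would use a published codon usage table).
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def translate(seq):
    """Translate an in-frame DNA sequence (length a multiple of 3)."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def codon_optimise(seq):
    """Swap every codon for the preferred synonymous codon, leaving the
    encoded amino acid sequence unchanged."""
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(
        PREFERRED[CODON_TABLE[seq[i:i + 3]]] for i in range(0, len(seq), 3)
    )


example = "ATGGGTGCGAGAGCGTCA"      # hypothetical 6-codon example
optimised = codon_optimise(example)
```

The nucleotide sequence changes at several positions while the translated product stays the same, which is the property the text relies on.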

By virtue of alterations in their sequences, the nucleotide sequences encoding the 
packaging components of the viral particles required for assembly of viral
particles in the producer cells/packaging cells have RNA instability sequences 
(INS) eliminated from them. At the same time, the amino acid coding sequence 
for the packaging components is retained so that the viral components encoded 
by the sequences remain the same, or at least sufficiently similar that the function 
of the packaging components is not compromised.

The term "viral polypeptide required for the assembly of viral particles" means a 
polypeptide normally encoded by the viral genome to be packaged into viral 
particles, in the absence of which the viral genome cannot be packaged. For 
example, in the context of retroviruses such polypeptides would include gag-pol
and env. The term "packaging component" is also included within this definition. 

As discussed in our WO99/32646, the sequence requirements for packaging HIV
vector genomes are complex. The HIV-1 packaging signal encompasses the
splice donor site and contains a portion of the 5'-untranslated region of the gag
gene, which has a putative secondary structure containing 4 short stem-loops.
However, additional sequences elsewhere in the genome are also known to be
important for efficient encapsidation of HIV. For example, the first 350 bps of
the gag protein coding sequence may contribute to efficient packaging. Thus, for
construction of HIV-1 vectors capable of expressing heterologous genes, a
packaging signal extending to 350 bps of the gag protein-coding region has been
used on the vector genome. We have now found that codon optimisation of the
gag coding region on the packaging vector, at least in the region into which the
packaging signal extends, also has the effect of disrupting packaging of the vector
genome. Thus codon optimisation is a novel method of obtaining a replication
defective viral particle. 

Also as disclosed in WO99/32646, the structure of the packaging signal in equine
lentiviruses is different from that of HIV. Instead of a short sequence of 4 stem
loops together with a packaging signal extending to 350 bps of the gag protein-coding
region, we have found that in equine lentiviruses the packaging signal
may not extend as far into the gag protein-coding region as may have been 
thought. 

In one embodiment only codons relating to the packaging signal are codon 
optimised. Thus, in one embodiment, codon optimisation extends to at least the 
first 350 bps of the gag protein coding region. In equine lentiviruses, at least, 
codon optimisation extends to at least nucleotide 300 of the gag coding region, 
more preferably to at least nucleotide 150 of the gag coding region. Although not
optimal, codon optimisation could extend to, say, only the first 109 nucleotides of 
the gag coding region. It may also be possible for codon optimisation to extend 
to only the first codon of the gag coding region. 

However, in a much more preferred and practical embodiment, the sequences are
codon optimised in their entirety, with the exception of the sequence 
encompassing the frameshift site. 

The gag-pol gene comprises two overlapping reading frames encoding gag and 
pol proteins respectively. The expression of both proteins depends on a
frameshift during translation. This frameshift occurs as a result of ribosome
"slippage" during translation. This slippage is thought to be caused at least in
part by ribosome-stalling RNA secondary structures. Such secondary structures
exist downstream of the frameshift site in the gag-pol gene. For HIV, the region
of overlap extends from nucleotide 1222 downstream of the beginning of gag
(wherein nucleotide 1 is the A of the gag ATG) to the end of gag (nt 1503).
Consequently, a 281 bp fragment spanning the frameshift site and the
overlapping region of the two reading frames is preferably not codon optimised.
Retaining this fragment will enable more efficient expression of the gag-pol
proteins.
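
The exclusion of the frameshift and overlap region from codon optimisation may be sketched in Python as follows. This is a toy illustration under stated assumptions: the SYNONYMOUS_SWAP mapping, the function name and the 12-nucleotide test sequence are hypothetical, coordinates are 1-based and inclusive with nucleotide 1 being the A of the gag ATG, and any codon touching the protected window is left as wild type.

```python
# Toy map from a codon to a synonymous replacement codon (hypothetical
# choices for this sketch; codons not listed are left unchanged).
SYNONYMOUS_SWAP = {"GGT": "GGC", "GCG": "GCC", "AGA": "CGC", "TCA": "AGC"}


def optimise_outside(seq, protect_start, protect_end):
    """Codon optimise seq, except for codons that overlap the protected
    window protect_start..protect_end (1-based, inclusive)."""
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        first_nt, last_nt = i + 1, i + 3
        protected = first_nt <= protect_end and last_nt >= protect_start
        out.append(codon if protected else SYNONYMOUS_SWAP.get(codon, codon))
    return "".join(out)


# Coordinates quoted in the text for the HIV gag/pol overlap.
HIV_OVERLAP_START, HIV_OVERLAP_END = 1222, 1503
overlap_span = HIV_OVERLAP_END - HIV_OVERLAP_START   # the 281 bp of the text

toy = "GGTGGTGGTGGT"
result = optimise_outside(toy, 4, 6)      # protect the second codon only
```

For a full-length gag-pol sequence the same call would be made with the window 1222 to 1503, so that the wild type sequence is retained across the frameshift site.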

For EIAV the beginning of the overlap has been taken to be nt 1262 (where
nucleotide 1 is the A of the gag ATG). The end of the overlap is at nt 1461. In
order to ensure that the frameshift site and the gag-pol overlap are preserved, the wild type
sequence has been retained from nt 1156 to 1465. This can be seen in Figure 9b.

Deviations from optimal codon usage may be made, for example, in order to
accommodate convenient restriction sites, and conservative amino acid changes 
may be introduced into the gag-pol proteins. 

In a highly preferred embodiment, codon optimisation was based on highly
expressed mammalian genes. The third and sometimes the second and third base 
may be changed. An example of a codon usage table is given in Figure 3b. 

Due to the degenerate nature of the Genetic Code, it will be appreciated that
numerous gag-pol sequences can be achieved by a skilled worker. Also there are
many retroviral variants described and which can be used as a starting point for
generating a codon optimised gag-pol sequence. Lentiviral genomes can be quite
variable. For example there are many quasi-species of HIV-1 which are still
functional. This is also the case for EIAV. These variants may be used to
enhance particular parts of the transduction process. Examples of HIV-1
variants may be found at http://hiv-web.lanl.gov . Details of EIAV clones may be
found at the NCBI database: http://www.ncbi.nlm.nih.gov .
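
To give a sense of how many nucleotide sequences the degeneracy of the genetic code permits for a given protein, the following Python sketch multiplies the number of synonymous codons available at each position. The function name and the six-residue example peptide are hypothetical and serve only to illustrate the arithmetic.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Number of codons encoding each amino acid (e.g. 1 for M, 6 for R).
CODONS_PER_AA = Counter(CODON_TABLE.values())


def count_coding_sequences(protein):
    """Number of distinct nucleotide sequences encoding the given protein
    (one-letter amino acid codes)."""
    return prod(CODONS_PER_AA[aa] for aa in protein)


n_sequences = count_coding_sequences("MGARAS")   # 1 * 4 * 4 * 6 * 4 * 6
```

Even a six-residue peptide admits over two thousand encodings, so the number available for a full-length gag-pol protein is very large.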

The strategy for codon optimised gag-pol sequences can be used in relation to
any retrovirus. This would apply to all the lentiviruses, including EIAV, FIV,
BIV, CAEV, VMR, SIV, HIV-1 and HIV-2. In addition this method could be
used to increase expression of genes from HTLV-1, HTLV-2, HFV, HSRV and
human endogenous retroviruses (HERV).

As codon optimisation may result in disruption of RNA secondary structures such 
as the packaging signal, it will be appreciated that any endogenous packaging 
signal upstream of the gag initiation codon could be retained without 
compromising safety.

An additional advantage of codon optimising packaging components is that this 
can increase gene expression. In particular, it can render gag-pol expression Rev 
independent. In order to enable the use of anti-rev or RRE factors in the retroviral
vector, however, it would be necessary to render the viral vector generation
system totally Rev/RRE independent (5). Thus, the genome also needs to be
modified. This is achieved by optimising vector genome components.
Advantageously, these modifications also lead to the production of a safer system
absent of all accessory proteins both in the producer and in the transduced cell,
and are described below.

As described above, the packaging components for a retroviral vector include 
expression products of gag, pol and env genes. In addition, efficient packaging 
depends on a short sequence of 4 stem loops followed by a partial sequence from
gag and env (the "packaging signal"). Thus, inclusion of a deleted gag sequence
in the retroviral vector genome (in addition to the full gag sequence on the
packaging construct) will optimise vector titre. To date efficient packaging has
been reported to require from 255 to 360 nucleotides of gag in vectors that still
retain env sequences, or about 40 nucleotides of gag in a particular combination
of splice donor mutation, gag and env deletions. We have surprisingly found that
a deletion of up to 360 nucleotides in gag leads to an increase in vector titre.
Further deletions resulted in lower titres. Additional mutations at the major
splice donor site upstream of gag were found to disrupt packaging signal
secondary structure and therefore lead to decreased vector titre. Thus, preferably,
the retroviral vector genome includes a gag sequence from which up to 360
nucleotides have been removed. 

We therefore allow the preparation of a so-called "minimal" system in which all 
of the accessory genes may be removed. In HIV these accessory genes are vpr, 
vif, tat, nef, vpu and rev. Similarly, in other lentiviruses the analogous accessory
genes normally present in the lentivirus may be removed. For the avoidance of
doubt, however, we would mention that the present invention also extends to
systems, particles and vectors in which one or more of these accessory genes is 
present and in any combination. 

The term "viral vector" refers to a nucleotide construct comprising a viral 
genome capable of being transcribed in a host cell, which genome comprises 
sufficient viral genetic information to allow packaging of the viral RNA genome, 
in the presence of packaging components, into a viral particle capable of infecting
a target cell. Infection of the target cell includes reverse transcription and
integration into the target cell genome, where appropriate for particular viruses.
The viral vector in use typically carries heterologous coding sequences
(nucleotides of interest or "NOIs") which are to be delivered by the vector to the
target cell, for example a first nucleotide sequence encoding a ribozyme. By
"replication defective" we mean that a viral vector is incapable of independent
replication to produce infectious viral particles within the final target cell. 

The term "viral vector system" is intended to mean a kit of parts which can be
used when combined with other necessary components for viral particle
production to produce viral particles in host cells. For example, an NOI may
typically be present in a plasmid vector construct suitable for cloning the NOI
into a viral genome vector construct. When combined in a kit with a further
nucleotide sequence, which will also typically be present in a separate plasmid
vector construct, the resulting combination of plasmid containing the NOI and
plasmid containing the further nucleotide sequence comprises the essential
elements of the invention. Such a kit may then be used by the skilled person in 
the production of suitable viral vector genome constructs which when transfected 
into a host cell together with the plasmid containing the further nucleotide 
sequence, and optionally nucleic acid constructs encoding other components 
required for viral assembly, will lead to the production of infectious viral
particles. 

Alternatively, the further nucleotide sequence may be stably present within a 
packaging cell line that is included in the kit. 

The kit may include the other components needed to produce viral particles, such 
as host cells and other plasmids encoding essential viral polypeptides required for 
viral assembly. By way of example, the kit may contain (i) a plasmid containing 
an NOI and (ii) a plasmid containing a further nucleotide sequence encoding a
modified retroviral gag-pol construct which has been codon optimised for
expression in a producer of choice. Optional components would then be (a) a 
retroviral genome construct with suitable restriction enzyme recognition sites for 
cloning the NOI into the viral genome, optionally with at least a partial gag 
sequence; (b) a plasmid encoding a VSV-G env protein. Alternatively,
nucleotide sequences encoding viral polypeptides required for assembly of viral
particles may be provided in the kit as packaging cell lines comprising the
nucleotide sequences, for example a VSV-G expressing cell line.

The term "viral vector production system" refers to the viral vector system
described above wherein the NOI has already been inserted into a suitable viral 
vector genome. 

In the present invention, several terms are used interchangeably. Thus, "virion",
"virus", "viral particle", "retroviral particle", "retrovirus", and "vector particle"
mean virus and virus-like particles that are capable of introducing a nucleic acid
into a cell through a viral-like entry mechanism. Such vector particles can, under
certain circumstances, mediate the transfer of NOIs into the cells they infect. A
retrovirus is capable of reverse transcribing its genetic material into DNA and
incorporating this genetic material into a target cell's DNA upon transduction. 
Such cells are designated herein as "target cells". 

As used herein the term "target cell" simply refers to a cell which the regulated 
retroviral vector of the present invention, whether native or targeted, is capable of
infecting or transducing. 

A lentiviral vector particle according to the invention will be capable of 
transducing cells which are slowly-dividing, and which non-lentiviruses such as 
MLV would not be able to efficiently transduce. Slowly-dividing cells divide
once in about every three to four days including certain tumour cells. Although 
tumours contain rapidly dividing cells, some tumour cells especially those in the 
centre of the tumour, divide infrequently. 

Alternatively the target cell may be a growth-arrested cell capable of undergoing
cell division such as a cell in a central portion of a tumour mass or a stem cell 
such as a haematopoietic stem cell or a CD34-positive cell. 

As a further alternative, the target cell may be a precursor of a differentiated cell 
such as a monocyte precursor, a CD33-positive cell, or a myeloid precursor.

As a further alternative, the target cell may be a differentiated cell such as a
neuron, astrocyte, glial cell, microglial cell, macrophage, monocyte, epithelial 
cell, endothelial cell, hepatocyte, spermatocyte, spermatid or spermatozoa. 

Target cells may be transduced either in vitro after isolation from a human 
individual or may be transduced directly in vivo. 

Viral vectors according to the invention are retroviral vectors, in particular 
lentiviral vectors such as HIV and EIAV vectors. The retroviral vector of the
present invention may be derived from or may be derivable from any suitable 
retrovirus. A large number of different retroviruses have been identified. 
Examples include: murine leukemia virus (MLV), human immunodeficiency 
virus (HIV), simian immunodeficiency virus, human T-cell leukemia virus 
(HTLV), equine infectious anaemia virus (EIAV), mouse mammary tumour
virus (MMTV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), 
Moloney murine leukemia virus (Mo-MLV), FBR murine osteosarcoma virus 
(FBR MSV), Moloney murine sarcoma virus (Mo-MSV), Abelson murine 
leukemia virus (A-MLV), Avian myelocytomatosis virus-29 (MC29), and Avian 
erythroblastosis virus (AEV). A detailed list of retroviruses may be found in
Coffin et al., 1991, "Retroviruses", Cold Spring Harbour Laboratory Press Eds:
JM Coffin, SM Hughes, HE Varmus pp 758-763. 

The term "derivable" is used in its normal sense as meaning a nucleotide sequence 
such as an LTR or a part thereof which need not necessarily be obtained from a
vector such as a retroviral vector but instead could be derived therefrom. By way of 
example, the sequence may be prepared synthetically or by use of recombinant 
DNA techniques.

Details on the genomic structure of some retroviruses may be found in the art.
By way of example, details on HIV and Mo-MLV may be found from the NCBI
Genbank (Genome Accession Nos. AF033819 and AF033811, respectively).
Details of HIV variants may also be found at http://hiv-web.lanl.gov . Details of
EIAV variants may be found through http://www.ncbi.nlm.nih.gov .

The lentivirus group can be split even further into "primate" and "non-primate". 
Examples of primate lentiviruses include human immunodeficiency virus (HIV), 
the causative agent of acquired immunodeficiency syndrome (AIDS), and
simian immunodeficiency virus (SIV). The non-primate lentiviral group includes
the prototype "slow virus" visna/maedi virus (VMV), as well as the related
caprine arthritis-encephalitis virus (CAEV), equine infectious anaemia virus
(EIAV) and the more recently described feline immunodeficiency virus (FIV)
and bovine immunodeficiency virus (BIV). 

The basic structure of a retrovirus genome is a 5' LTR and a 3' LTR, between or 
within which are located a packaging signal to enable the genome to be packaged, 
a primer binding site, integration sites to enable integration into a host cell 
genome and gag, pol and env genes encoding the packaging components - these 
are polypeptides required for the assembly of viral particles. More complex
retroviruses have additional features, such as rev and RRE sequences in HIV, 
which enable the efficient export of RNA transcripts of the integrated provirus 
from the nucleus to the cytoplasm of an infected target cell. 

In the provirus, these genes are flanked at both ends by regions called long
terminal repeats (LTRs). The LTRs are responsible for proviral integration, and
transcription. LTRs also serve as enhancer-promoter sequences and can control
the expression of the viral genes. Encapsidation of the retroviral RNAs occurs by
virtue of a psi sequence, which it has been disclosed in respect of HIV, at least, is
located at the 5' end of the viral genome.

The LTRs themselves are identical sequences that can be divided into three
elements, which are called U3, R and U5. U3 is derived from the sequence 
unique to the 3' end of the RNA. R is derived from a sequence repeated at both 
ends of the RNA and U5 is derived from the sequence unique to the 5' end of the
RNA. The sizes of the three elements can vary considerably among different 
retroviruses. 

In a defective retroviral vector genome gag, pol and env may be absent or not 
functional. The R regions at both ends of the RNA are repeated sequences. U5
and U3 represent unique sequences at the 5' and 3' ends of the RNA genome 
respectively. 

As discussed above, in a typical retroviral vector for use in gene therapy, at least 
part of one or more of the gag, pol and env protein coding regions essential for
replication may be removed from the viral vector. This makes the retroviral 
vector replication-defective. The removed portions may even be replaced by a 
nucleotide sequence of interest (NOI), as in the present invention, to generate a 
virus capable of integrating its genome into a host genome but wherein the 
modified viral genome is unable to propagate itself due to a lack of structural
proteins. When integrated in the host genome, expression of the NOI occurs - 
resulting in, for example, a therapeutic and/or a diagnostic effect. Thus, the 
transfer of an NOI into a site of interest is typically achieved by: integrating the 
NOI into the recombinant viral vector; packaging the modified viral vector into a 
virion coat; and allowing transduction of a site of interest - such as a targeted cell
or a targeted cell population. 

A minimal retroviral genome for use in the present invention may therefore
comprise (5') R - U5 - one or more NOIs - U3 - R (3'). However, the plasmid
vector used to produce the retroviral genome within a host cell/packaging cell
will also include transcriptional regulatory control sequences operably linked to
the retroviral genome to direct transcription of the genome in a host
cell/packaging cell. These regulatory sequences may be the natural sequences
associated with the transcribed retroviral sequence, i.e. the 5' U3 region, or they
may be a heterologous promoter such as another viral promoter, for example the
CMV promoter.

Some retroviral genomes require additional sequences for efficient virus
production. For example, in the case of HIV, rev and RRE sequences should be
included. However, we have found that the requirement for rev and RRE can be
reduced or eliminated by codon optimisation. As expression of the codon
optimised gag-pol is Rev-independent, the RRE can be removed from the gag-pol
expression cassette, thus removing any potential for recombination with any RRE
contained on the vector genome.
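
By way of illustration only, the principle of codon optimisation referred to
above may be sketched as follows. The sketch replaces each codon of a coding
sequence with a synonymous codon taken from a preference map, leaving the
encoded protein unchanged. The names codon_optimise and PREFERRED, the short
test sequence and the preference map itself are hypothetical placeholders and
are not taken from the present specification; in practice the map would be
derived from a codon usage table for the producer cell. The sketch does not
model Rev-independence or any other biological effect.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical preference map (amino acid -> codon). Illustrative only:
# these are NOT measured codon usage values for any real producer cell.
PREFERRED = {"K": "AAG", "R": "CGC", "E": "GAG", "L": "CTG", "G": "GGC", "A": "GCC"}


def translate(dna):
    """Translate a DNA coding sequence (length divisible by 3) to protein."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimise(dna, preferred):
    """Swap each codon for the preferred synonymous codon, if one is given."""
    for aa, codon in preferred.items():
        # Guard against a preference map that would change the protein.
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(out)


if __name__ == "__main__":
    original = "ATGAAAAGATTAGAAGGAGCATAA"  # arbitrary toy sequence
    optimised = codon_optimise(original, PREFERRED)
    print(original, translate(original))
    print(optimised, translate(optimised))
```

The nucleotide sequence changes at most codons while the translated protein is
identical, which is the property relied upon when a codon optimised gag-pol
sequence is substituted for the wild type sequence.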

Once the retroviral vector genome has integrated into the genome of its target
cell, the NOI sequences need to be expressed. In a retrovirus,
the promoter is located in the 5' LTR U3 region of the provirus. In retroviral
vectors, the promoter driving expression of a therapeutic gene may be the native
retroviral promoter in the 5' U3 region, or an alternative promoter engineered
into the vector. The alternative promoter may physically replace the 5' U3
promoter native to the retrovirus, or it may be incorporated at a different place
within the vector genome such as between the LTRs.

Thus, the NOI will also be operably linked to a transcriptional regulatory control
sequence to allow transcription of the first nucleotide sequence to occur in the
target cell. The control sequence will typically be active in mammalian cells. The
control sequence may, for example, be a viral promoter such as the natural viral
promoter or a CMV promoter or it may be a mammalian promoter. It is
particularly preferred to use a promoter that is preferentially active in a particular
cell type or tissue type which the virus to be treated primarily infects. Thus, in
one embodiment, a tissue-specific regulatory sequence may be used. The
regulatory control sequences driving expression of the one or more first
nucleotide sequences may be constitutive or regulated promoters.

The term "operably linked" denotes a relationship between a regulatory region
(typically a promoter element, but may include an enhancer element) and the 
coding region of a gene, whereby the transcription of the coding region is under 
the control of the regulatory region. 

As used herein, the term "enhancer" includes a DNA sequence which binds other
protein components of the transcription initiation complex and thus facilitates the 
initiation of transcription directed by its associated promoter. 

In one preferred embodiment of the present invention, the enhancer is an 
ischaemic like response element (ILRE).

The term "ischaemia like response element" - otherwise written as ILRE -
includes an element that is responsive to or is active under conditions of
ischaemia or conditions that are like ischaemia or are caused by ischaemia. By
way of example, conditions that are like ischaemia or are caused by ischaemia
include hypoxia and/or low glucose concentration(s).

The term "hypoxia" means a condition under which a particular organ or tissue
receives an inadequate supply of oxygen. 

Ischaemia can be an insufficient supply of blood to a specific organ or tissue. A 
consequence of decreased blood supply is an inadequate supply of oxygen to the 
organ or tissue (hypoxia). Prolonged hypoxia may result in injury to the affected 
organ or tissue. 

A preferred ILRE is a hypoxia response element (HRE).

In one preferred aspect of the present invention, there is hypoxia or ischaemia
regulatable expression of the retroviral vector components. In this regard,
hypoxia is a powerful regulator of gene expression in a wide range of different
cell types and acts by the induction of the activity of hypoxia-inducible
transcription factors such as hypoxia inducible factor-1 (HIF-1; 6), which bind to
cognate DNA recognition sites, the hypoxia-responsive elements (HREs) on
various gene promoters. Dachs et al (7) have used a multimeric form of the HRE
from the mouse phosphoglycerate kinase-1 (PGK-1) gene (8) to control
expression of both marker and therapeutic genes by human fibrosarcoma cells in
response to hypoxia in vitro and within solid tumours in vivo (7 ibid).

Hypoxia response enhancer elements (HREEs) have also been found in
association with a number of genes including the erythropoietin (EPO) gene (9;
10). Other HREEs have been isolated from regulatory regions of the muscle
glycolytic enzyme pyruvate kinase (PKM) gene (11), the human muscle-specific
β-enolase gene (ENO3; 12) and the endothelin-1 (ET-1) gene (13).

Preferably the HRE of the present invention is selected from, for example, the
erythropoietin HRE element (HREE1), muscle pyruvate kinase (PKM) HRE
element, phosphoglycerate kinase (PGK) HRE, β-enolase (enolase 3; ENO3)
HRE element, endothelin-1 (ET-1) HRE element and metallothionein II (MTII)
HRE element.

Preferably the ILRE is used in combination with a transcriptional regulatory 
element, such as a promoter, which transcriptional regulatory element is 
preferably active in one or more selected cell type(s), preferably being only active 
in one cell type. 

As outlined above, this combination aspect of the present invention is called a
responsive element.

Preferably the responsive element comprises at least the ILRE as herein defined. 

Non-limiting examples of such a responsive element are presented as OBHRE1 
and XiaMac. Another non-limiting example includes the ILRE in use in 
conjunction with an MLV promoter and/or a tissue restricted ischaemic 
responsive promoter. These responsive elements are disclosed in WO 99/15684.

Other examples of suitable tissue restricted promoters/enhancers are those which
are highly active in tumour cells such as a promoter/enhancer from a MUC1
gene, a CEA gene or a 5T4 antigen gene. The alpha-fetoprotein (AFP) promoter
is also a tumour-specific promoter. One preferred promoter-enhancer
combination is a human cytomegalovirus (hCMV) major immediate early (MIE)
promoter/enhancer combination.

The term "promoter" is used in the normal sense of the art, e.g. an RNA 
polymerase binding site. 

The promoter may be located in the retroviral 5' LTR to control the expression of 
a cDNA encoding an NOI, and/or gag-pol proteins. 

Preferably the NOI and/or gag-pol proteins are capable of being expressed from 
the retrovirus genome such as from endogenous retroviral promoters in the long
terminal repeat (LTR). 

Preferably the NOI and/or gag-pol proteins are expressed from a heterologous
promoter to which the heterologous gene or sequence, and/or codon optimised
gag-pol sequence is operably linked.

Alternatively, the promoter may be an internal promoter.

Preferably the NOI is expressed from an internal promoter.

Vectors containing internal promoters have also been widely used to express
multiple genes. An internal promoter makes it possible to exploit
promoter/enhancer combinations other than those found in the viral LTR for
driving gene expression. Multiple internal promoters can be included in a
retroviral vector and it has proved possible to express at least three different
cDNAs each from its own promoter (14). Internal ribosomal entry site (IRES)
elements have also been used to allow translation of multiple coding regions from
either a single mRNA or from fusion proteins that can then be expressed from an
open reading frame.

The promoter of the present invention may be constitutively efficient, or may be
tissue or temporally restricted in its activity.

Preferably the promoter is a constitutive promoter such as CMV. 

Preferably the promoters of the present invention are tissue specific. That is, they 
are capable of driving transcription of a NOI or NOI(s) in one tissue while 
remaining largely "silent" in other tissue types. 

The term "tissue specific" means a promoter which is not restricted in activity to
a single tissue type but which nevertheless shows selectivity in that it may be
active in one group of tissues and less active or silent in another group.

The level of expression of an NOI or NOIs under the control of a particular
promoter may be modulated by manipulating the promoter region. For example,
different domains within a promoter region may possess different gene regulatory
activities. The roles of these different regions are typically assessed using vector
constructs having different variants of the promoter with specific regions deleted
(that is, deletion analysis). This approach may be used to identify, for example,
the smallest region capable of conferring tissue specificity or the smallest region
conferring hypoxia sensitivity.

A number of tissue specific promoters, described above, may be particularly 
advantageous in practising the present invention. In most instances, these 
promoters may be isolated as convenient restriction digestion fragments suitable 
for cloning in a selected vector. Alternatively, promoter fragments may be 
isolated using the polymerase chain reaction. Cloning of the amplified fragments 
may be facilitated by incorporating restriction sites at the 5' end of the primers.

The NOI or NOIs may be under the expression control of an expression
regulatory element, such as a promoter and enhancer. 

Preferably the ischaemic responsive promoter is a tissue restricted ischaemic 
responsive promoter. 

Preferably the tissue restricted ischaemic responsive promoter is a macrophage 
specific promoter restricted by repression. 

Preferably the tissue restricted ischaemic responsive promoter is an endothelium 
specific promoter.

Preferably the regulated retroviral vector of the present invention is an ILRE
regulated retroviral vector.

Preferably the regulated retroviral vector of the present invention is an ILRE
regulated lentiviral vector.

Preferably the regulated retroviral vector of the present invention is an 
autoregulated hypoxia responsive lentiviral vector.

Preferably the regulated retroviral vector of the present invention is regulated by 
glucose concentration. 

For example, the glucose-regulated proteins (grp's) such as grp78 and grp94 are
highly conserved proteins known to be induced by glucose deprivation (15). The
grp78 gene is expressed at low levels in most normal healthy tissues under the
influence of basal level promoter elements but has at least two critical "stress
inducible regulatory elements" upstream of the TATA element (15 ibid; 16).

Attachment to a truncated 632 base pair sequence of the 5' end of the grp78
promoter confers high inducibility to glucose deprivation on reporter genes in
vitro (16 ibid). Furthermore, this promoter sequence in retroviral vectors was
capable of driving a high level expression of a reporter gene in tumour cells in
murine fibrosarcomas, particularly in central relatively ischaemic/fibrotic sites
(16 ibid).

Preferably the regulated retroviral vector of the present invention is a
self-inactivating (SIN) vector.

By way of example, self-inactivating retroviral vectors have been constructed by
deleting the transcriptional enhancers or the enhancers and promoter in the U3
region of the 3' LTR. After a round of vector reverse transcription and
integration, these changes are copied into both the 5' and the 3' LTRs producing a
transcriptionally inactive provirus (17; 18; 19; 20). However, any promoter(s)
internal to the LTRs in such vectors will still be transcriptionally active. This
strategy has been employed to eliminate effects of the enhancers and promoters
in the viral LTRs on transcription from internally placed genes. Such effects
include increased transcription (21) or suppression of transcription (22). This
strategy can also be used to eliminate downstream transcription from the 3' LTR
into genomic DNA (23). This is of particular concern in human gene therapy
where it is of critical importance to prevent the adventitious activation of an
endogenous oncogene.
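
The self-inactivating behaviour described above can be pictured with a small
bookkeeping sketch. It treats the RNA genome as R - U5 - internal elements -
U3 - R and the provirus as LTR - internal elements - LTR, where each LTR is
U3 - R - U5 and its U3 is copied from the 3' end of the RNA. All names
(provirus_from_rna, active_promoters, the element labels) are hypothetical
and chosen purely for illustration; the sketch tracks element order only and
does not model reverse transcription itself.

```python
INTACT_U3 = "U3"
SIN_U3 = "U3(deleted)"  # U3 with enhancer/promoter deleted (illustrative label)


def provirus_from_rna(u5, internal, u3, r="R"):
    """Return proviral element order for an RNA genome R-U5-internal-U3-R.

    Both proviral LTRs are U3-R-U5, and the U3 in each is derived from the
    U3 at the 3' end of the RNA, so a 3' U3 deletion appears in both LTRs.
    """
    ltr = [u3, r, u5]
    return ltr + list(internal) + ltr


def active_promoters(provirus):
    """List promoters expected to be active in the provirus (toy rule)."""
    active = []
    if provirus[0] == INTACT_U3:  # 5' LTR promoter needs an intact U3
        active.append("5' LTR")
    active += [e for e in provirus if e == "internal_promoter"]
    return active


if __name__ == "__main__":
    internal = ["psi", "internal_promoter", "NOI"]
    for u3 in (INTACT_U3, SIN_U3):
        provirus = provirus_from_rna("U5", internal, u3)
        print(" - ".join(provirus), "->", active_promoters(provirus))
```

With the deleted U3, only the internal promoter remains in the list of active
promoters, which corresponds to the transcriptionally inactive LTRs and still
active internal promoter(s) described above.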

As discussed above, replication-defective retroviral vectors are typically
propagated, for example to prepare suitable titres of the retroviral vector for
subsequent transduction, by using a combination of a packaging or helper cell
line and the recombinant vector. That is to say, that the three packaging proteins
can be provided in trans.

In general a "packaging cell line" contains one or more of the retroviral gag, pol
and env genes. In the present invention it contains codon optimised gag-pol
genes, and optionally an env gene. The packaging cell line produces the proteins
required for packaging retroviral DNA but it cannot bring about encapsidation.
Conventionally this has been achieved through lack of a psi region. However,
when a recombinant vector carrying an NOI and a psi region is introduced into
the packaging cell line, the helper proteins can package the psi-positive
recombinant vector to produce the recombinant virus stock. This virus stock can
be used to transduce cells to introduce the NOI into the genome of the target
cells. Conventionally a psi packaging signal, called psi plus, has been used that
contains additional sequences spanning from upstream of the splice donor to
downstream of the gag start codon (24) since this has been shown to increase
viral titres.

The recombinant virus whose genome lacks all genes required to make viral
proteins can transduce only once and cannot propagate. These viral vectors which
are only capable of a single round of transduction of target cells are known as
replication defective vectors. Hence, the NOI is introduced into the host/target
cell genome without the generation of potentially harmful retrovirus. A summary
of the available packaging lines is presented in Coffin et al, 1997 (ibid).

The retroviral packaging cell line is preferably in the form of a transiently
transfected cell line. Transient transfections may advantageously be used to
measure levels of vector production when vectors are being developed. In this
regard, transient transfection avoids the longer time required to generate stable
vector-producing cell lines and may also be used if the vector or retroviral
packaging components are toxic to cells. Components typically used to generate
retroviral vectors include a plasmid encoding the gag-pol proteins, a plasmid
encoding the env protein and a plasmid containing an NOI. Vector production
involves transient transfection of one or more of these components into cells
containing the other required components. If the vector encodes toxic genes or
genes that interfere with the replication of the host cell, such as inhibitors of the
cell cycle or genes that induce apoptosis, it may be difficult to generate stable
vector-producing cell lines, but transient transfection can be used to produce the
vector before the cells die. Also, cell lines have been developed using transient
transfection that produce vector titre levels that are comparable to the levels
obtained from stable vector-producing cell lines (25).

Producer cells/packaging cells can be of any suitable cell type. Producer cells are
generally mammalian cells but can be, for example, insect cells. A producer cell
may be a packaging cell containing the virus structural genes, normally integrated
into its genome, into which the regulated retroviral vectors of the present
invention are introduced. Alternatively the producer cell may be transfected with
nucleic acid sequences encoding structural components, such as codon optimised
gag-pol and env on one or more vectors such as plasmids, adenovirus vectors,
herpes viral vectors or any method known to deliver functional DNA into target
cells. The vectors according to the present invention are then introduced into the
packaging cell by the methods of the present invention.

As used herein, the term "producer cell" or "vector producing cell" refers to a cell 
which contains all the elements necessary for production of regulated retroviral
vector particles and regulated retroviral delivery systems. 

Preferably, the producer cell is obtainable from a stable producer cell line. 

Preferably, the producer cell is obtainable from a derived stable producer cell
line. 

Preferably, the producer cell is obtainable from a derived producer cell line.

As used herein, the term "derived producer cell line" is a transduced producer cell
line which has been screened and selected for high expression of a marker gene.
Such cell lines contain retroviral insertions in integration sites that support high
level expression from the retroviral genome. The term "derived producer cell
line" is used interchangeably with the term "derived stable producer cell line" and
the term "stable producer cell line".

Preferably the derived producer cell line includes but is not limited to a retroviral 
and/or a lentiviral producer cell. 

Preferably the derived producer cell line is an HIV or EIAV producer cell line,
more preferably an EIAV producer cell line. 

Preferably the envelope protein sequences and nucleocapsid sequences are all
stably integrated in the producer and/or packaging cell. However, one or more of
these sequences could also exist in episomal form and gene expression could
occur from the episome.

As used herein, the term "packaging cell" refers to a cell which contains those
elements necessary for production of infectious recombinant virus which are
lacking in a recombinant viral vector. Typically, such packaging cells contain
one or more vectors which are capable of expressing viral structural proteins
(such as codon optimised gag-pol and env) but they do not contain a packaging
signal.

The term "packaging signal" which is referred to interchangeably as "packaging
sequence" or "psi" is used in reference to the non-coding, cis-acting sequence
required for encapsidation of retroviral RNA strands during viral particle
formation. In HIV-1, this sequence has been mapped to loci extending from
upstream of the major splice donor site (SD) to at least the gag start codon.

Packaging cell lines suitable for use with the above-described vector constructs 
may be readily prepared (see also WO 92/05266), and utilised to create producer 
cell lines for the production of retroviral vector particles. As already mentioned, a 
summary of the available packaging lines is presented in "Retroviruses" (1997
Cold Spring Harbor Laboratory Press Eds: JM Coffin, SM Hughes, HE Varmus
pp449). 

Also as discussed above, simple packaging cell lines, comprising a provirus in
which the packaging signal has been deleted, have been found to lead to the rapid
production of undesirable replication competent viruses through recombination.
In order to improve safety, second generation cell lines have been produced
wherein the 3' LTR of the provirus is deleted. In such cells, two recombinations
would be necessary to produce a wild type virus. A further improvement involves
the introduction of the gag-pol genes and the env gene on separate constructs,
giving so-called third generation packaging cell lines. These constructs are
introduced sequentially to prevent recombination during transfection (26; 27).

Preferably, the packaging cell lines are second generation packaging cell lines. 

Preferably, the packaging cell lines are third generation packaging cell lines. 

In these split-construct, third generation cell lines, a further reduction in 
recombination may be achieved by "codon wobbling". This technique, based on 
the redundancy of the genetic code, aims to reduce homology between the
separate constructs, for example between the regions of overlap in the gag-pol 
and env open reading frames. 
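
A minimal sketch of the idea behind "codon wobbling" is given below, under
stated assumptions. For each codon of a coding region it picks the synonymous
codon sharing the fewest bases with the original, and then reports the
nucleotide identity between the original and recoded sequences. The function
names (wobble, identity), the selection rule and the example sequence are
hypothetical and are not taken from the cited work. The sketch recodes a
single reading frame only; where two open reading frames genuinely overlap, a
change that is silent in one frame may alter the other, and that constraint is
not handled here.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence (length divisible by 3) to protein."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def shared_bases(a, b):
    """Count positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b))


def wobble(dna):
    """Recode each codon with the synonymous codon least like the original."""
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        synonyms = sorted(c for c, aa in CODON_TABLE.items()
                          if aa == CODON_TABLE[codon])
        # min() keeps the first of equally good candidates, so the result
        # is deterministic; amino acids with one codon are left unchanged.
        out.append(min(synonyms, key=lambda c: shared_bases(c, codon)))
    return "".join(out)


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return shared_bases(a, b) / len(a)


if __name__ == "__main__":
    region = "CTGAGCCGCGGC"  # arbitrary toy sequence
    recoded = wobble(region)
    print(region, recoded, translate(region) == translate(recoded))
    print("identity:", identity(region, recoded))
```

The recoded region encodes the same amino acids but shares fewer nucleotides
with the original, which is the sense in which homology between separate
constructs is reduced.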

The packaging cell lines are useful for providing the gene products necessary to 
encapsidate and provide a membrane protein for a high titre regulated retrovirus
vector and regulated nucleic gene delivery vehicle production. When regulated 
retrovirus sequences are introduced into the packaging cell lines, such sequences 
are encapsidated with the nucleocapsid (gag-pol) proteins and these units then 
bud through the cell membrane to become surrounded in cell membrane and to 
contain the envelope protein produced in the packaging cell line. These
infectious regulated retroviruses are useful as infectious units per se or as gene 
delivery vectors. 

The packaging cell may be a cell cultured in vitro such as a tissue culture cell
line. Suitable cell lines include but are not limited to mammalian cells such as
murine fibroblast derived cell lines or human cell lines. Preferably the packaging
cell line is a human cell line, such as for example: HEK293, 293-T, TE671,
HT1080.

Alternatively, the packaging cell may be a cell derived from the individual to be
treated such as a monocyte, macrophage, blood cell or fibroblast. The cell may
be isolated from an individual and the packaging and vector components
administered ex vivo followed by re-administration of the autologous packaging
cells.

It is highly desirable to use high-titre virus preparations in both experimental and
practical applications. Techniques for increasing viral titre include using a psi
plus packaging signal as discussed above and concentration of viral stocks. In
addition, the use of different envelope proteins, such as the G protein from
vesicular stomatitis virus, has improved titres following concentration to 10^9 per
ml (28). However, typically the envelope protein will be chosen such that the
viral particle will preferentially infect cells that are infected with the virus which
it is desired to treat. For example where an HIV vector is being used to treat HIV
infection, the env protein used will be the HIV env protein.

The process of producing a retroviral vector in which the envelope protein is not
the native envelope of the retrovirus is known as "pseudotyping". Certain
envelope proteins, such as MLV envelope protein and vesicular stomatitis virus
G (VSV-G) protein, pseudotype retroviruses very well. Pseudotyping is not a
new phenomenon and examples may be found in WO-A-98/05759,
WO-A-98/05754, WO-A-97/17457, WO-A-96/09400, WO-A-91/00047 and (29).

As used herein, the term "high titre" means an effective amount of a retroviral 
vector or particle which is capable of transducing a target site such as a cell.

As used herein, the term "effective amount" means an amount of a regulated 
retroviral or lentiviral vector or vector particle which is sufficient to induce 
expression of an NOI at a target site. 

Preferably the titre is from at least 10^6 retrovirus particles per ml, such as from
10^6 to 10^7 per ml, more preferably at least 10^7 retrovirus particles per ml.



It is possible to manipulate the viral
genome or the regulated retroviral vector nucleotide sequence, so that viral genes
are replaced or supplemented with one or more NOIs which may be heterologous
NOIs.

The term "heterologous" refers to a nucleic acid sequence or protein sequence
linked to a nucleic acid or protein sequence to which it is not naturally linked.

[...] (i.e. nucleotide sequence of interest) [...] a
complete naturally occurring DNA sequence. Thus, the DNA sequence can be,
for example, a synthetic DNA sequence, a recombinant DNA sequence (i.e.
prepared by use of recombinant DNA techniques), a cDNA sequence or a partial
[...]

Many different selectable markers have been used successfully in retroviral
vectors. These are reviewed in "Retroviruses" (1997 Cold Spring Harbor
Laboratory Press Eds: JM Coffin, SM Hughes, HE Varmus pp 444) and include,
but are not limited to, the bacterial neomycin (neo) and hygromycin
phosphotransferase genes which confer resistance to G418 and hygromycin
respectively; a mutant mouse dihydrofolate reductase gene which confers
resistance to methotrexate; the bacterial gpt gene which allows cells to grow in
medium containing mycophenolic acid, xanthine and aminopterin; the bacterial
hisD gene which allows cells to grow in medium without histidine but containing
histidinol; the multidrug resistance gene (mdr) which confers resistance to a
variety of drugs; and the bacterial genes which confer resistance to puromycin or
phleomycin. All of these markers are dominant selectable and allow chemical
selection of most cells expressing these genes. Other selectable markers are not
dominant in that their use must be in conjunction with a cell line that lacks the
relevant enzyme activity. Examples of non-dominant selectable markers include
the thymidine kinase (tk) gene which is used in conjunction with tk- cell lines.

Particularly preferred markers are blasticidin and neomycin, optionally operably
linked to a thymidine kinase coding sequence typically under the transcriptional
control of a strong viral promoter such as the SV40 promoter.

In accordance with the present invention, suitable NOI sequences include those
that are of therapeutic and/or diagnostic application such as, but not limited
to: sequences encoding cytokines, chemokines, hormones, antibodies, engineered
immunoglobulin-like molecules, a single chain antibody, fusion proteins,
enzymes, immune co-stimulatory molecules, immunomodulatory molecules,
anti-sense RNA, a transdominant negative mutant of a target protein, a toxin, a
conditional toxin, an antigen, a tumour suppressor protein and growth factors,
membrane proteins, vasoactive proteins and peptides, anti-viral proteins and
ribozymes, and derivatives thereof (such as with an associated reporter group).
When included, such coding sequences may be typically operatively linked to a
suitable promoter, which may be a promoter driving expression of a ribozyme(s),
or a different promoter or promoters, such as in one or more specific cell types.

Suitable NOIs for use in the invention in the treatment or prophylaxis of cancer
include NOIs encoding proteins which: destroy the target cell (for example a
ribosomal toxin); act as tumour suppressors (such as wild-type p53), activators
of anti-tumour immune mechanisms (such as cytokines, co-stimulatory molecules
and immunoglobulins) or inhibitors of angiogenesis; provide enhanced
drug sensitivity (such as pro-drug activation enzymes); indirectly stimulate
destruction of the target cell by natural effector cells (for example, a strong
antigen to stimulate the immune system); or convert a precursor substance to a
toxic substance which destroys the target cell (for example a prodrug activating
enzyme).

Examples of prodrugs include but are not limited to: etoposide phosphate (used
with alkaline phosphatase); 5-fluorocytosine (with cytosine deaminase);
Doxorubicin-N-p-hydroxyphenoxyacetamide (with Penicillin-V-Amidase);
Para-N-bis(2-chloroethyl)aminobenzoyl glutamate (with Carboxypeptidase G2);
Cephalosporin nitrogen mustard carbamates (with β-lactamase); SR4233 (with
p450 reductase); Ganciclovir (with HSV thymidine kinase); mustard pro-drugs
(with nitroreductase); and cyclophosphamide or ifosfamide (with cytochrome
p450).

Suitable NOIs for use in the treatment or prevention of ischaemic heart disease
include NOIs encoding plasminogen activators. Suitable NOIs for the treatment
or prevention of rheumatoid arthritis or cerebral malaria include genes encoding
anti-inflammatory proteins, antibodies directed against tumour necrosis factor
(TNF) alpha, and anti-adhesion molecules (such as antibody molecules or
receptors specific for adhesion molecules).

The expression products encoded by the NOIs may be proteins which are
secreted from the cell. Alternatively the NOI expression products are not
secreted and are active within the cell. In either event, it is preferred for the NOI
expression product to demonstrate a bystander effect or a distant bystander effect;
that is the production of the expression product in one cell leading to the killing
of additional, related cells, either neighbouring or distant (e.g. metastatic), which
possess a common phenotype. Encoded proteins could also destroy bystander
tumour cells (for example with secreted antitumour antibody-ribosomal toxin
fusion protein), indirectly stimulate destruction of bystander tumour cells (for
example cytokines to stimulate the immune system or procoagulant proteins
causing local vascular occlusion) or convert a precursor substance to a toxic
substance which destroys bystander tumour cells (e.g. an enzyme which activates a
prodrug to a diffusible drug). Also envisaged is the delivery of NOI(s) encoding
antisense transcripts or ribozymes which interfere with expression of cellular
genes for tumour persistence (for example against aberrant myc transcripts in
Burkitt's lymphoma or against bcr-abl transcripts in chronic myeloid leukemia).
The use of combinations of such NOIs is also envisaged.

The NOI or NOIs of the present invention may also comprise one or more
cytokine-encoding NOIs. Suitable cytokines and growth factors include but are
not limited to: ApoE, Apo-SAA, BDNF, Cardiotrophin-1, EGF, ENA-78,
Eotaxin, Eotaxin-2, Exodus-2, FGF-acidic, FGF-basic, fibroblast growth
factor-10 (30), FLT3 ligand, Fractalkine (CX3C), GDNF, G-CSF, GM-CSF, GF-β1,
insulin, IFN-γ, IGF-I, IGF-II, IL-1α, IL-1β, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7,
IL-8 (72 a.a.), IL-8 (77 a.a.), IL-9, IL-10, IL-11, IL-12, IL-13, IL-15, IL-16, IL-17,
IL-18 (IGIF), Inhibin α, Inhibin β, IP-10, keratinocyte growth factor-2 (KGF-2),
KGF, Leptin, LIF, Lymphotactin, Mullerian inhibitory substance, monocyte
colony inhibitory factor, monocyte attractant protein (30 ibid), M-CSF, MDC (67
a.a.), MDC (69 a.a.), MCP-1 (MCAF), MCP-2, MCP-3, MCP-4,
MIG, MIP-1α, MIP-1β, MIP-3α, MIP-3β, MIP-4, myeloid
progenitor inhibitor factor-1 (MPIF-1), NAP-2, Neurturin, Nerve growth factor,
β-NGF, NT-3, NT-4, Oncostatin M, PDGF-AA, PDGF-AB, PDGF-BB, PF-4,
RANTES, SDF1α, SDF1β, SCF, SCGF, stem cell factor (SCF), TARC, TGF-α,
TGF-β, TGF-β2, TGF-β3, tumour necrosis factor (TNF), TNF-α, TNF-β,
TNIL-1, TPO, VEGF, GCP-2, GRO/MGSA, GRO-β, GRO-γ, HCC1, I-309.

The NOI or NOIs may be under the expression control of an expression
regulatory element, such as a promoter and/or a promoter enhancer, known as
"responsive elements" in the present invention.

When the regulated retroviral vector particles are used to transfer NOIs into cells
which they transduce, such vector particles are also designated "viral delivery
systems" or "retroviral delivery systems". Viral vectors, including retroviral
vectors, have been used to transfer NOIs efficiently by exploiting the viral
transduction process. NOIs cloned into the retroviral genome can be delivered
efficiently to cells susceptible to transduction by a retrovirus. Through other
genetic manipulations, the replicative capacity of the retroviral genome can be
destroyed. The vectors introduce new genetic material into a cell but are unable
to replicate.

The regulated retroviral vector of the present invention can be delivered by viral
or non-viral techniques. Non-viral delivery systems include but are not limited to
DNA transfection methods. Here, transfection includes a process using a
non-viral vector to deliver a gene to a target mammalian cell.

Typical transfection methods include electroporation, DNA biolistics,
lipid-mediated transfection, compacted DNA-mediated transfection, liposomes,
immunoliposomes, lipofectin, cationic agent-mediated transfection, cationic
facial amphiphiles (CFAs) (31), multivalent cations such as spermine, cationic
lipids or polylysine, 1,2-bis(oleoyloxy)-3-(trimethylammonio)propane
(DOTAP)-cholesterol complexes (32) and combinations thereof.

Viral delivery systems include but are not limited to an adenovirus vector, an
adeno-associated viral (AAV) vector, a herpes viral vector, a retroviral vector, a
lentiviral vector, or a baculoviral vector. These viral delivery systems may be
configured as a split-intron vector. A split-intron vector is described in WO
99/15683.

Other examples of vectors include ex vivo delivery systems, which include but
are not limited to DNA transfection methods such as electroporation, DNA
biolistics, lipid-mediated transfection and compacted DNA-mediated transfection.

The vector may be a plasmid DNA vector. Alternatively, the vector may be a
recombinant viral vector. Suitable recombinant viral vectors include adenovirus
vectors, adeno-associated viral (AAV) vectors, herpes virus vectors, retroviral
vectors, lentiviral vectors or a combination of adenoviral and lentiviral vectors.
In the case of viral vectors, gene delivery is mediated by viral infection of a target
cell.

If the features of adenoviruses are combined with the genetic stability of 
retro/lentiviruses then essentially the adenovirus can be used to transduce target 
cells to become transient retroviral producer cells that could stably infect 
neighbouring cells. 

The present invention also provides a pharmaceutical composition for treating an
individual by gene therapy, wherein the composition comprises a therapeutically
effective amount of a regulated retroviral vector according to the present
invention. The pharmaceutical composition may be for human or animal usage.
Typically, a physician will determine the actual dosage which will be most
suitable for an individual subject and it will vary with the age, weight and
response of the particular patient.

The composition may optionally comprise a pharmaceutically acceptable carrier,
diluent, excipient or adjuvant. The choice of pharmaceutical carrier, excipient or
diluent can be selected with regard to the intended route of administration and
standard pharmaceutical practice. The pharmaceutical compositions may
comprise as - or in addition to - the carrier, excipient or diluent any suitable
binder(s), lubricant(s), suspending agent(s), coating agent(s), solubilising
agent(s), and other carrier agents that may aid or increase the viral entry into the
target site (such as for example a lipid delivery system).

Where appropriate, the pharmaceutical compositions can be administered by any
one or more of: minipumps, inhalation, in the form of a suppository or pessary,
topically in the form of a lotion, solution, cream, ointment or dusting powder, by
use of a skin patch, orally in the form of tablets containing excipients such as
starch or lactose, or in capsules or ovules either alone or in admixture with
excipients, or in the form of elixirs, solutions or suspensions containing
flavouring or colouring agents, or they can be injected parenterally, for example
intracavernosally, intravenously, intramuscularly or subcutaneously. For
parenteral administration, the compositions may be best used in the form of a
sterile aqueous solution which may contain other substances, for example enough
salts or monosaccharides to make the solution isotonic with blood. For buccal or
sublingual administration the compositions may be administered in the form of
tablets or lozenges which can be formulated in a conventional manner.

The present invention is believed to have a wide therapeutic applicability -
depending on inter alia the selection of the one or more NOIs.

For example, the present invention may be useful in the treatment of the disorders
listed in WO-A-98/05635. For ease of reference, part of that list is now provided:
cancer, inflammation or inflammatory disease, dermatological disorders, fever,
cardiovascular effects, haemorrhage, coagulation and acute phase response,
cachexia, anorexia, acute infection, HIV infection, shock states, graft-versus-host
reactions, autoimmune disease, reperfusion injury, meningitis, migraine and
aspirin-dependent anti-thrombosis; tumour growth, invasion and spread,
angiogenesis, metastases, malignant ascites and malignant pleural effusion;
cerebral ischaemia, ischaemic heart disease, osteoarthritis, rheumatoid arthritis,
osteoporosis, asthma, multiple sclerosis, neurodegeneration, Alzheimer's disease,
atherosclerosis, stroke, vasculitis, Crohn's disease and ulcerative colitis;
periodontitis, gingivitis; psoriasis, atopic dermatitis, chronic ulcers, epidermolysis
bullosa; corneal ulceration, retinopathy and surgical wound healing; rhinitis,
allergic conjunctivitis, eczema, anaphylaxis; restenosis, congestive heart failure,
endometriosis, atherosclerosis or endosclerosis.

In addition, or in the alternative, the present invention may be useful in the
treatment of disorders listed in WO-A-98/07859. For ease of reference, part of
that list is now provided: cytokine and cell proliferation/differentiation activity;
immunosuppressant or immunostimulant activity (e.g. for treating immune
deficiency, including infection with human immune deficiency virus; regulation
of lymphocyte growth; treating cancer and many autoimmune diseases, and to
prevent transplant rejection or induce tumour immunity); regulation of
haematopoiesis, e.g. treatment of myeloid or lymphoid diseases; promoting
growth of bone, cartilage, tendon, ligament and nerve tissue, e.g. for healing
wounds, treatment of burns, ulcers and periodontal disease and
neurodegeneration; inhibition or activation of follicle-stimulating hormone
(modulation of fertility); chemotactic/chemokinetic activity (e.g. for mobilising
specific cell types to sites of injury or infection); haemostatic and thrombolytic
activity (e.g. for treating haemophilia and stroke); anti-inflammatory activity (for
treating e.g. septic shock or Crohn's disease); as antimicrobials; modulators of
e.g. metabolism or behaviour; as analgesics; treating specific deficiency
disorders; in treatment of e.g. psoriasis, in human or veterinary medicine.

In addition, or in the alternative, the present invention may be useful in the
treatment of disorders listed in WO-A-98/09985. For ease of reference, part of
that list is now provided: macrophage inhibitory and/or T cell inhibitory activity
and thus, anti-inflammatory activity; anti-immune activity, i.e. inhibitory effects
against a cellular and/or humoral immune response, including a response not
associated with inflammation; inhibit the ability of macrophages and T cells to
adhere to extracellular matrix components and fibronectin, as well as
up-regulated fas receptor expression in T cells; inhibit unwanted immune reaction
and inflammation including arthritis, including rheumatoid arthritis, inflammation
associated with hypersensitivity, allergic reactions, asthma, systemic lupus
erythematosus, collagen diseases and other autoimmune diseases, inflammation
associated with atherosclerosis, arteriosclerosis, atherosclerotic heart disease,
reperfusion injury, cardiac arrest, myocardial infarction, vascular inflammatory
disorders, respiratory distress syndrome or other cardiopulmonary diseases,
inflammation associated with peptic ulcer, ulcerative colitis and other diseases of
the gastrointestinal tract, hepatic fibrosis, liver cirrhosis or other hepatic diseases,
thyroiditis or other glandular diseases, glomerulonephritis or other renal and
urologic diseases, otitis or other oto-rhino-laryngological diseases, dermatitis or
other dermal diseases, periodontal diseases or other dental diseases, orchitis or
epididymo-orchitis, infertility, orchidal trauma or other immune-related testicular
diseases, placental dysfunction, placental insufficiency, habitual abortion,
eclampsia, pre-eclampsia and other immune and/or inflammatory-related
gynaecological diseases, posterior uveitis, intermediate uveitis, anterior uveitis,
conjunctivitis, chorioretinitis, uveoretinitis, optic neuritis, intraocular
inflammation, e.g. retinitis or cystoid macular oedema, sympathetic ophthalmia,
scleritis, retinitis pigmentosa, immune and inflammatory components of
degenerative fundus disease, inflammatory components of ocular trauma, ocular
inflammation caused by infection, proliferative vitreo-retinopathies, acute
ischaemic optic neuropathy, excessive scarring, e.g. following glaucoma filtration
operation, immune and/or inflammation reaction against ocular implants and
other immune and inflammatory-related ophthalmic diseases, inflammation
associated with autoimmune diseases or conditions or disorders where, both in
the central nervous system (CNS) or in any other organ, immune and/or
inflammation suppression would be beneficial, Parkinson's disease, complication
and/or side effects from treatment of Parkinson's disease, AIDS-related dementia
complex, HIV-related encephalopathy, Devic's disease, Sydenham chorea,
Alzheimer's disease and other degenerative diseases, conditions or disorders of
the CNS, inflammatory components of strokes, post-polio syndrome, immune and
inflammatory components of psychiatric disorders, myelitis, encephalitis,
subacute sclerosing pan-encephalitis, encephalomyelitis, acute neuropathy,
subacute neuropathy, chronic neuropathy, Guillain-Barré syndrome, Sydenham
chorea, myasthenia gravis, pseudo-tumour cerebri, Down's Syndrome,
Huntington's disease, amyotrophic lateral sclerosis, inflammatory components of
CNS compression or CNS trauma or infections of the CNS, inflammatory
components of muscular atrophies and dystrophies, and immune and
inflammatory related diseases, conditions or disorders of the central and
peripheral nervous systems, post-traumatic inflammation, septic shock, infectious
diseases, inflammatory complications or side effects of surgery, bone marrow
transplantation or other transplantation complications and/or side effects,
inflammatory and/or immune complications and side effects of gene therapy, e.g.
due to infection with a viral carrier, or inflammation associated with AIDS, to
suppress or inhibit a humoral and/or cellular immune response, to treat or
ameliorate monocyte or leukocyte proliferative diseases, e.g. leukaemia, by
reducing the amount of monocytes or lymphocytes, for the prevention and/or
treatment of graft rejection in cases of transplantation of natural or artificial cells,
tissue and organs such as cornea, bone marrow, organs, lenses, pacemakers,
natural or artificial skin tissue.

The invention will now be further described by way of Examples, which are meant
to serve to assist one of ordinary skill in the art in carrying out the invention and
are not intended in any way to limit the scope of the invention. The Examples refer
to the Figures. In the Figures:

Description of the Figures 

Figure 1 shows schematically how to create a suitable 3' LTR by PCR;

Figure 2 shows the codon usage table for wild type HIV gag-pol of strain HXB2
(accession number: K03455);

Figure 3a shows the codon usage table of the codon optimised sequence designated
gagpol-SYNgp. Figure 3b shows a comparative codon usage table;

Figure 4 shows the codon usage table of the wild type HIV env called env-mn;

Figure 5 shows the codon usage table of the codon optimised sequence of HIV env
designated SYNgp160mn;

Figure 6 shows two plasmid constructs for use in the invention;

Figure 7 shows the principle behind two systems for producing retroviral vector
particles;

Figure 8 shows a sequence comparison between the wild type HIV gag-pol
sequence (pGP-RRE3) and the codon optimised gag-pol sequence (pSYNGP);

Figure 9 shows a sequence comparison between the wild type EIAV gag-pol
sequence (WT) and the codon optimised gag-pol sequence (CO);

Figure 10 shows Rev independence of protein expression particle formation;

Figure 11 shows translation rates of wild-type (WT) and codon optimised gag-pol;

Figure 12 shows gag-pol mRNA levels in total and cytoplasmic fractions;

Figure 13 shows the effect of insertion of WT gag downstream of the codon
optimised gene on RNA and protein levels;

Figure 14 shows the plasmids used to study the effect of HIV-1 gag on the codon
optimised gene;

Figure 15 shows the effect on cytoplasmic RNA of insertion of HIV-1 gag
upstream of the codon optimised gene;

Figure 16 shows the effect of Leptomycin B (LMB) on protein production;

Figure 17 shows the cytoplasmic RNA levels of the vector genomes;

Figure 18 shows transduction efficiency at MOI 1;

Figure 19 shows a schematic representation of pGP-RRE3;

Figure 20 shows a schematic representation of pSYNGP;

Figure 21 shows vector titres generated with different gag-pol constructs;

Figure 22 shows vector titres from the Rev/RRE (-) and (+) genomes;

Figure 23 shows vector titres from the pHS series of vector genomes;

Figure 24 shows vector titres for the pHS series of vector genomes in the presence
or absence of Rev/RRE;

Figure 25 shows an analysis of gag-pol constructs;

Figure 26 shows a Western blot of 293T extracts;

Figure 27 is a schematic representation of pESYNGP;

Figure 28 is a schematic representation of LpESYNGP;

Figure 29 is a schematic representation of LpESYNGPRRE;

Figure 30 is a schematic representation of pESYNGPRRE;

Figure 31 is a schematic representation of pONY4.0Z;

Figure 32 is a schematic representation of pONY8.0Z;

Figure 33 is a schematic representation of pONY8.1Z;

Figure 34 is a schematic representation of pONY3.1;

Figure 35 is a schematic representation of pCIneoERev;

Figure 36 is a schematic representation of pESYNREV;

Figures 37 and 38 show the effect of different vector constructs on viral vector
titres;

Figures 39 and 40 show the effect of different vector constructs on RT activity;

Figure 41 shows the effect of the 5' leader sequence on viral vector titre;

Figure 42 shows viral vector titres when using pONY8.1Z;

Figure 43 shows a comparison between the sequences of pONY3.1 and codon
optimised pONY3.2OPTI in the first 372 nucleotides of gag;

Figure 44 is a schematic representation of pIRES1hygESYNGP;

Figures 45 and 46 show the results of experiments to confirm that codon optimised
gag-pol can be used in the production of packaging and producer cell lines;

Figures 47 and 48 show the results of experiments which confirm that RNA from
codon optimised gag-pol is packaged less efficiently than that from the wild-type
gene;

Figure 49 shows the results of an experiment which confirms that expression from
pESYNGP and pESDSYNGP are similar;

Figure 50 is a schematic representation of pESDSYNGP; and

Figure 51 shows the results of an experiment which confirms that the efficiency of
encapsidating gag-pol RNA in PEV-17 cells and B-241 cells is similar.

In more detail, Figure 8 shows a sequence comparison between the wild type HIV
gag-pol sequence (pGP-RRE3) and the codon optimised gag-pol sequence
(pSYNGP) wherein the upper sequence represents pSYNGP and the lower sequence
represents pGP-RRE3.

Figure 10 shows Rev independence of protein expression particle formation. 5 µg of
the gag-pol expression plasmids were transfected into 293T cells in the presence or
absence of Rev (pCMV-Rev, 1 µg) and protein levels were determined 48 hours post
transfection in culture supernatants (A) and cell lysates (B). HIV-1 positive human
serum was used to detect the gag-pol proteins. The blots were re-probed with an
anti-actin antibody, as an internal control (C). The protein marker (New England
Biolabs) sizes (in kDa) are shown on the side of the gel. Lanes: 1. Mock transfected
293T cells, 2. pGP-RRE3, 3. pGP-RRE3 + pCMV-Rev, 4. pSYNGP, 5. pSYNGP +
pCMV-Rev, 6. pSYNGP-RRE, 7. pSYNGP-RRE + pCMV-Rev, 8. pSYNGP-ERR,
9. pSYNGP-ERR + pCMV-Rev.

Figure 11 shows translation rates of WT and codon optimised gag-pol. 293T cells
were transfected with 2 µg pGP-RRE3 (+/- 1 µg pCMV-Rev) or 2 µg pSYNGP.
Protein samples from culture supernatants (A) and cell extracts (B) were analysed
by Western blotting 12, 25, 37 and 48 hours post-transfection. HIV-1 positive
human serum was used to detect gag-pol proteins (A, B) and an anti-actin
antibody was used as an internal control (C). The protein marker sizes are shown
on the side of the gel (in kD). A Phosphorimager was used for quantification of
the results. Lanes: 1. pGP-RRE3 12h, 2. pGP-RRE3 25h, 3. pGP-RRE3 37h, 4.
pGP-RRE3 48h, 5. pGP-RRE3 + pCMV-Rev 12h, 6. pGP-RRE3 + pCMV-Rev
25h, 7. pGP-RRE3 + pCMV-Rev 37h, 8. pGP-RRE3 + pCMV-Rev 48h, 9.
pSYNGP 12h, 10. pSYNGP 25h, 11. pSYNGP 37h, 12. pSYNGP 48h, 13. Mock
transfected 293T cells.

Figure 12 shows gag-pol mRNA levels in total and cytoplasmic fractions. Total
and cytoplasmic RNA was extracted from 293T cells 36 hours after transfection
with 5 µg of the gag-pol expression plasmid (+/- 1 µg pCMV-Rev) and mRNA
levels were estimated by Northern blot analysis. A probe complementary to nt
1222-1503 of both the wild type and codon optimised gene was used. Panel A
shows the band corresponding to the HIV-1 gag-pol. The sizes of the mRNAs are
4.4 kb for the codon optimised and 6 kb for the wild type gene. Panel B shows
the band corresponding to human ubiquitin (internal control for normalisation of
results). Quantification was performed using a Phosphorimager. Lane numbering:
c indicates cytoplasmic fraction and t indicates total RNA fraction. Lanes: 1.
pGP-RRE3, 2. pGP-RRE3 + pCMV-Rev, 3. pSYNGP, 4. pSYNGP + pCMV-Rev,
5. pSYNGP-RRE, 6. pSYNGP-RRE + pCMV-Rev, 7. Mock transfected
293T cells, 8. pGP-RRE3 + pCMV-Rev, 9. Mock transfected 293T cells, 10.
pSYNGP.

Figure 13 shows the effect of insertion of WT gag downstream of the codon
optimised gene on RNA and protein levels. The wt gag sequence was inserted
downstream of the codon optimised gene in both orientations (NotI site),
resulting in plasmids pSYN6 (correct orientation, see Figure 14) and pSYN7
(reverse orientation, see Figure 14). The gene encoding β-galactosidase
(LacZ) was also inserted in the same site and the correct orientation (plasmid
pSYN8, see Figure 14). 293T cells were transfected with 5 µg of each plasmid
and 48 hours post transfection mRNA and protein levels were determined as
previously described by means of Northern and Western blot analysis
respectively.

Northern blot analysis in cytoplasmic RNA fractions. The blot was probed with a
probe complementary to nt 1510-2290 of the codon optimised gene (I) and was
re-probed with a probe specific for human ubiquitin (II). Lanes: 1. pSYNGP, 2.
pSYN8, 3. pSYN7, 4. pSYN6.

Western blot analysis: HIV-1 positive human serum was used to detect the
gag-pol proteins (I) and an anti-actin antibody was used as an internal control (II).
Lanes: Cell lysates: 1. Mock transfected 293T cells, 2. pGP-RRE3 + pCMV-Rev,
3. pSYNGP, 4. pSYN6, 5. pSYN7, 6. pSYN8. Supernatants: 7. Mock transfected
293T cells, 8. pGP-RRE3 + pCMV-Rev, 9. pSYNGP, 10. pSYN6, 11. pSYN7,
12. pSYN8. The protein marker (New England Biolabs) sizes are shown on the
side of the gel.

Figure 14 shows the plasmids used to study the effect of HIV-1 gag on the codon
optimised gene. The backbone for all constructs was pCI-Neo. Syn gp: The
codon optimised HIV-1 gag-pol gene. HXB2 gag: The wild type HIV-1 gag
gene. HXB2 gagr: The wild type HIV-1 gag gene in the reverse orientation.
HXB2 gag ΔATG: The wild type HIV-1 gag gene without the gag ATG. HXB2
gag-fr.sh.: The wild type HIV-1 gag gene with a frameshift mutation. HXB2 gag
625-1503: Nucleotides 625-1503 of the wild type HIV-1 gag gene. HXB2 gag
1-625: Nucleotides 1-625 of the wild type HIV-1 gag gene.

Figure 15 shows the effect on cytoplasmic RNA of insertion of HIV-1 gag
upstream of the codon optimised gene. Cytoplasmic RNA was extracted 48 hours
post transfection of 293T cells (5 µg of each pSYN plasmid was used and 1 µg of
pCMV-Rev was co-transfected in some cases). The probe that was used was
designed to be complementary to nt 1510-2290 of the codon optimised gene (I).
A probe specific for human ubiquitin was used as an internal control (II).

Lanes: 1. pSYNGP, 2. pSYN9, 3. pSYN10, 4. pSYN10 + pCMV-Rev, 5.
pSYN11, 6. pSYN11 + pCMV-Rev, 7. pCMV-Rev.

Lanes: 1. pSYNGP, 2. pSYNGP-RRE, 3. pSYNGP-RRE + pCMV-Rev, 4.
pSYN12, 5. pSYN14, 6. pSYN14 + pCMV-Rev, 7. pSYN13, 8. pSYN15, 9.
pSYN17, 10. pGP-RRE3, 11. pSYN6, 12. pSYN9, 13. pCMV-Rev.

Figure 16 shows the effect of LMB on protein production. 293T cells were
transfected with 1 µg pCMV-Rev and 3 µg of pGP-RRE3/pSYNGP/pSYNGP-RRE
(+/- 1 µg pCMV-Rev). Transfections were done in duplicate. 5 hours post
transfection the medium was replaced with fresh medium in the first set and with
fresh medium containing 7.5 nM LMB in the second. 20 hours later the cells
were lysed and protein production was estimated by Western blot analysis. HIV-1
positive human serum was used to detect the gag-pol proteins (A) and an
anti-actin antibody was used as an internal control (B). Lanes: 1. pGP-RRE3, 2.
pGP-RRE3 + LMB, 3. pGP-RRE3 + pCMV-Rev, 4. pGP-RRE3 + pCMV-Rev +
LMB, 5. pSYNGP, 6. pSYNGP + LMB, 7. pSYNGP + pCMV-Rev, 8. pSYNGP
+ pCMV-Rev + LMB, 9. pSYNGP-RRE, 10. pSYNGP-RRE + LMB, 11.
pSYNGP-RRE + pCMV-Rev, 12. pSYNGP-RRE + pCMV-Rev + LMB.

Figure 17 shows the cytoplasmic RNA levels of the vector genomes. 293T cells
were transfected with 10 µg of each vector genome. Cytoplasmic RNA was
extracted 48 hours post transfection. 20 µg of RNA were used from each sample
for Northern blot analysis. The 700 bp probe was designed to hybridise to all
vector genome RNAs (see Materials and Methods). Lanes: 1. pH6nZ, 2. pH6nZ +
pCMV-Rev, 3. pH6.1nZ, 4. pH6.1nZ + pCMV-Rev, 5. pHS1nZ, 6. pHS2nZ, 7.
pHS3nZ, 8. pHS4nZ, 9. pHS5nZ, 10. pHS6nZ, 11. pHS7nZ, 12. pHS8nZ, 13.
pCMV-Rev.

Figure 18 shows transduction efficiency at MOI 1. Viral stocks were generated
by co-transfection of each gag-pol expression plasmid (5 or 0.5 µg), 15 µg
pH6nZ or pHS3nZ (vector genome plasmid) and 5 µg pHCMVG (VSV envelope
expression plasmid) on 293T cells. Virus was concentrated as previously
described (45) and transduction efficiency was determined at m.o.i.'s 0.01-1 on
HT1080 cells. There was a linear correlation of transduction efficiency and m.o.i.
in all cases. An indicative picture at m.o.i. 1 is shown here. Transduction
efficiency was >80% with either genome, either gag-pol and either high or low
amounts of pSYNGP. Titres before concentration (I.U./ml): on 293T cells: A.
6.6 x 10^5, B. 7.6 x 10^5, C. 9.2 x 10^5, D. 1.5 x 10^5; on HT1080 cells: A.
6.0 x 10^4, B. 9.9 x 10^4, C. 8.0 x 10^4, D. 2.9 x 10^4. Titres after concentration
(I.U./ml) on HT1080 cells: A. 6.0 x 10^5, B. 2.0 x 10^6, C. 1.4 x 10^6, D.
2.0 x 10^5.
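
By way of illustration only, the arithmetic linking a titre to an m.o.i. and to
a fold concentration can be sketched as follows. The sketch uses the HT1080
titres quoted for stocks B and C in the legend of Figure 18; the function
names and the cell number (1 x 10^5 target cells) are hypothetical and are not
taken from the experiments described here.

```python
def inoculum_volume_ml(moi: float, cells: int, titre_iu_per_ml: float) -> float:
    """Volume of viral stock (ml) giving the requested m.o.i.

    m.o.i. is taken as infectious units added per target cell.
    """
    if titre_iu_per_ml <= 0:
        raise ValueError("titre must be positive")
    return moi * cells / titre_iu_per_ml


def fold_concentration(titre_before: float, titre_after: float) -> float:
    """Ratio of the titre after concentration to the titre before."""
    if titre_before <= 0:
        raise ValueError("titre before concentration must be positive")
    return titre_after / titre_before


# Stock B after concentration (2.0 x 10^6 I.U./ml on HT1080 cells);
# the number of target cells is a hypothetical value.
volume_b = inoculum_volume_ml(moi=1.0, cells=100_000, titre_iu_per_ml=2.0e6)

# Stock C on HT1080 cells: 8.0 x 10^4 before, 1.4 x 10^6 after concentration.
fold_c = fold_concentration(8.0e4, 1.4e6)

print(f"Stock B volume for m.o.i. 1: {volume_b:.3f} ml")
print(f"Stock C fold concentration: {fold_c:.1f}")
```

The same two functions can be applied to the other stocks; no claim is made
that the titres were calculated in exactly this way.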

Figure 21 shows vector titres obtained with different gag-pol constructs. Viral
stocks were generated by co-transfection of each gag-pol expression plasmid,
pH6nZ (vector genome plasmid) and pHCMVG (VSV envelope expression
plasmid, 2.5 µg for each transfection) on 293T cells. Titres (I.U./ml of viral stock)
were measured on 293T cells by counting the number of blue colonies following
X-Gal staining 48 hours after transduction. Experiments were performed at least
twice and the variation between experiments was less than 15%.
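
As a rough illustration of how a titre in I.U./ml may be derived from a count
of blue colonies, the following sketch applies the commonly used relation
titre = colonies x dilution factor / inoculum volume. The function name and the
example colony count, dilution and volume are hypothetical and are not data
from the experiments reported here.

```python
def titre_iu_per_ml(blue_colonies: int, dilution_factor: float,
                    inoculum_ml: float) -> float:
    """Estimate infectious units per ml of undiluted viral stock.

    blue_colonies   -- X-Gal positive colonies counted in the well
    dilution_factor -- e.g. 1e4 for a 1:10,000 dilution of the stock
    inoculum_ml     -- volume of diluted stock applied to the cells
    """
    if dilution_factor <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution factor and inoculum volume must be positive")
    return blue_colonies * dilution_factor / inoculum_ml


# Hypothetical count: 66 blue colonies from 1 ml of a 1:10,000 dilution.
example_titre = titre_iu_per_ml(66, 1e4, 1.0)
print(f"{example_titre:.1e} I.U./ml")
```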

Figure 22 shows vector titres from the Rev/RRE (-) and (+) genomes. The retroviral
vectors were generated as described in the Examples. Titres (I.U./ml of viral stock ±
SD) were determined in 293T cells.

Figure 23 shows vector titres from the pHS series of vector genomes. The retroviral
vector was generated as described in the Examples. Titres (I.U./ml of viral stock ±
SD) were determined in 293T cells. Rev is provided from pCMV-Rev. Note that
pH6nZ expresses Rev and contains the RRE. None of the other genomes express
Rev or contain the RRE. Expression from pSYNGP is Rev independent, whereas it
is Rev dependent for pGP-RRE3.

Figure 24 shows vector titres for the pHS series of vector genomes in the presence
or absence of Rev/RRE. The retroviral vector was generated as described in the
Examples. 5 µg of vector genome, 5 µg of pSYNGP and 2.5 µg of pHCMVG were
used and titres (I.U./ml) were determined in 293T cells. Experiments were
performed at least twice and the variation between experiments was less than 15%.
Rev is provided from pCMV-Rev (1 µg). Note that pH6nZ expresses Rev and
contains the RRE. None of the pHS genomes expresses Rev and only pHS1nZR,
pHS3nZR, pHS7nZR and pH6.1nZR contain the RRE. gag-pol expression from
pSYNGP is Rev independent.

Figure 26 shows a Western blot of 293T extracts wherein 30 µg of total cellular
protein was separated by SDS/PAGE electrophoresis, transferred to nitrocellulose
and probed with anti-EIAV antibodies. The secondary antibody was anti-Horse
HRP (Sigma).

In Figure 38 the titres are shown in lacZ forming units (L.F.U.)/ml. The vectors
used are indicated in boxes above the bars.

For ease of reference, we also set out the sequences listed in the accompanying
Sequence Listing:

SEQ ID NO:1 shows the sequence of the wild-type gag-pol sequence for the strain
HXB2 (accession no. K03455);

SEQ ID NO:2 shows the sequence of pSYNGP;

SEQ ID NO:3 shows the sequence of the Envelope gene for HIV-1 MN (Genbank
accession no. M17449);

SEQ ID NO:4 shows the sequence of SYNgp-160mn - codon optimised env
sequence;

SEQ ID NO:5 shows the sequence of pESYNGP;

SEQ ID NO:6 shows the sequence of LpESYNGP;

SEQ ID NO:7 shows the sequence of pESYNGPRRE;

SEQ ID NO:8 shows the sequence of LpESYNGPRRE;

SEQ ID NO:9 shows the sequence of pONY4.0Z;

SEQ ID NO:10 shows the sequence of pONY8.0Z;

SEQ ID NO:11 shows the sequence of pONY8.1Z;

SEQ ID NO:12 shows the sequence of pONY3.1;

SEQ ID NO:13 shows the sequence of pCIneoERev;

SEQ ID NO:14 shows the sequence of pESYNREV;

SEQ ID NO:15 shows the sequence of codon optimised HIV gag-pol;

SEQ ID NO:16 shows the sequence of codon optimised EIAV gag-pol;

SEQ ID NO:17 shows the sequence of pIRES1hygESYNGP;

SEQ ID NO:18 shows the sequence of pESDSYNGP; and

SEQ ID NO:19 shows the sequence of pONY8.3GFB29(-).

Example 1 - HIV

Cell lines

293T cells (33) and HeLa cells (34) were maintained in Dulbecco's modified Eagle's
medium containing 10% (v/v) fetal calf serum and supplemented with L-glutamine
and antibiotics (penicillin-streptomycin). 293T cells were obtained from D.
Baltimore (Rockefeller University).

HIV-1 proviral clones

Proviral clones pWI3 (35) and pNL4-3 (36) were used.

Construction of a Packaging System

In one of the present examples, a modified codon optimised HIV env sequence is 
used (SEQ LD. No. 4). The corresponding env expression plasmid is designated 
pSYNgpl60mn. The modified sequence contains extra motifs not used by (37). 
The extra sequences were taken from the HIV env sequence of strain MN and codon 
25 optimised. Any similar modification of the nucleic acid sequence would function 
similarly as long as it used codons corresponding to abundant tRNAs (38). 
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Codon optimised HIV-1 gag-pol gene 

A codon optimised gag-pol gene, shown from nt 1108 to 5414 of SEQ ID NO: 2,
was constructed by annealing a series of short overlapping oligonucleotides
(approximately 30-40mers with 25% overlap, i.e. approximately 9 nucleotides).
Oligonucleotides were purchased from R&D SYSTEMS (R&D Systems Europe
Ltd, 4-10 The Quadrant, Barton Lane, Abingdon, OX14 3YS, UK). Codon
optimisation was performed using the sequence of HXB-2 strain (AC: K03455)
(39). The Kozak consensus sequence for optimal translation initiation (40) was
also included. The fragment from base 1222 of gag to the end of gag (base 1503)
was not optimised, in order to maintain the frameshift site and the overlap
between the gag and pol reading frames; this fragment was taken from clone
pNL4-3. (When referring to base numbers within the gag-pol gene, base 1 is the
A of the gag ATG, which corresponds to base 790 from the beginning of the HXB2
sequence. When referring to sequences outside the gag-pol gene, the numbers
refer to bases from the beginning of the HXB2 sequence, where base 1
corresponds to the beginning of the 5' LTR.) Some deviations from optimisation
were made in order to introduce convenient restriction sites. The final codon
usage, shown in Figure 3b, now resembles that of highly expressed human genes
and is quite different from that of the wild type HIV-1 gag-pol. The gene was
cloned into the EcoRI-NotI sites of the mammalian expression vector pCIneo
(Promega). The resulting plasmid was named pSYNGP (Figure 20, SEQ ID No 2).
Sequencing of the gene in both strands verified the absence of any mistakes. A
sequence comparison between the codon optimised and wild type HIV gag-pol
sequences is shown in Figure 8.
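
The codon replacement and base numbering described above can be illustrated with a short sketch. It is not the procedure used to design SEQ ID NO: 2. The table of preferred codons is an assumption (one frequently used human codon per amino acid), the function names are hypothetical, and the sketch reads a single frame only, so it does not model the gag/pol frameshift. The protected window mimics the decision to leave bases 1222-1503 unmodified.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: one frequently used human codon per amino acid (illustrative only).
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}

HXB2_GAG_ATG = 790  # from the text: gag-pol base 1 is HXB2 base 790


def translate(cds):
    """Translate an in-frame coding sequence (length must be a multiple of 3)."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def optimise(cds, keep=None):
    """Swap each codon for the preferred synonymous codon.

    keep is an optional 1-based inclusive (start, end) window; any codon that
    overlaps it is copied unchanged.
    """
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        first, last = i + 1, i + 3
        if keep and last >= keep[0] and first <= keep[1]:
            out.append(codon)  # inside the protected window
        else:
            out.append(PREFERRED[CODON_TO_AA[codon]])
    return "".join(out)


def gene_to_hxb2(base):
    """Convert a gag-pol base number to the HXB2 numbering used in the text."""
    return base + HXB2_GAG_ATG - 1


if __name__ == "__main__":
    wild_type = "ATGGGTGCGAGAGCGTCA"  # short illustrative fragment
    print(optimise(wild_type), translate(wild_type))
    print(gene_to_hxb2(1222), gene_to_hxb2(1503))
```

On this numbering the unmodified window 1222-1503 corresponds to HXB2 bases 2011-2292.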

Rev/RRE constructs 

The HIV-1 RRE sequence (bases 7769-8021 of the HXB2 sequence) was amplified
by PCR from the pWI3 proviral clone with primers bearing the NotI restriction
site and was subsequently cloned into the NotI site of pSYNGP. The resulting
plasmids were named pSYNGP-RRE (RRE in the correct orientation) and pSYNGP-ERR
(RRE in the reverse orientation).
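
As a brief illustration of why both orientations are recovered: a fragment flanked by the same palindromic site at both ends can ligate either way round, and in the reverse orientation the transcript carries the reverse complement of the insert. The sketch below uses hypothetical helper names and only the coordinates given in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def span_length(first, last):
    """Length of an inclusive base range such as 7769-8021."""
    return last - first + 1


if __name__ == "__main__":
    # The site reads the same on both strands, so the insert can go in either way.
    print(reverse_complement(NOTI_SITE) == NOTI_SITE)
    print(span_length(7769, 8021))
```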

Pseudotyped viral particles 

In one form of the packaging system a synthetic gag-pol cassette is coexpressed
with a heterologous envelope coding sequence. This could be, for example, VSV-G
(44, 45), amphotropic MLV env (46, 47) or any other protein that would be
incorporated into the HIV or EIAV particle (48). This includes molecules capable
of targeting the vector to specific tissues.

HIV-1 Vector genome constructs

pH6nZ is derived from pH4Z (49) by the addition of a single nucleotide to place
an extra guanine residue that was missing from pH4Z at the 5' end of the vector
genome transcript to optimise reverse transcription. In addition, the gene coding
for β-galactosidase (LacZ) was replaced by a gene encoding a nuclear
localising β-galactosidase. (We are grateful to Enca Martin-Rendon and Said
Ismail for providing pH6nZ.) In order to construct Rev(-) genome constructs the
following modifications were made: a) a 1.8 kb PstI - PstI fragment was
removed from pH6nZ, resulting in plasmid pH6.1nZ, and b) an EcoNI (filled) -
SphI fragment was substituted with a SpeI (filled) - SphI fragment from the same
plasmid (pH6nZ), resulting in plasmid pH6.2nZ. In both cases sequences within
gag (nt 1-625) were retained, as they have been shown to play a role in packaging
(93). Rev, RRE and any other residual env sequences were removed. pH6.2nZ
further contains the env splice acceptor, whereas pH6.1nZ does not.

A series of vectors encompassing further gag deletions plus or minus a mutant
major splice donor (SD) (GT to CA mutation) were also derived from pH6Z.
These were made by PCR with primers bearing a NarI (5' primers) and an SpeI
(3' primers) site. The PCR products were inserted into pH6Z at the NarI - SpeI
sites. The resulting vectors were named pHS1nZ (containing HIV-1 sequences up
to gag 40), pHS2nZ (containing HIV-1 sequences up to gag 260), pHS3nZ
(containing HIV-1 sequences up to gag 360), pHS4nZ (containing HIV-1
sequences up to gag 625), pHS5nZ (same as pHS1nZ but with a mutant SD),
pHS6nZ (same as pHS2nZ but with a mutant SD), pHS7nZ (same as pHS3nZ but
with a mutant SD) and pHS8nZ (same as pHS4nZ but with a mutant SD).

In addition, the RRE sequence (nt 7769-8021 of the HXB2 sequence) was
inserted in the SpeI (filled) site of pH6.1nZ, pHS1nZ, pHS3nZ and pHS7nZ,
resulting in plasmids pH6.1nZR, pHS1nZR, pHS3nZR and pHS7nZR
respectively.

Other modifications to the genome have been made, including the generation of a
SIN vector (by deletion of part of the 3' U3), the replacement of the LTRs with
those from MLV or replacement of part of the 3' U3 with the MLV U3 region.

Transient transfections, transductions and determination of viral titres 

These were performed as previously described (49, 50). Briefly, 293T cells were
seeded on 6 cm dishes and 24 hours later they were transiently transfected by
overnight calcium phosphate treatment. The medium was replaced 12 hours post-
transfection and, unless otherwise stated, supernatants were harvested 48 hours
post-transfection, filtered (through 0.22 or 0.45 µm filters) and titered by
transduction of 293T cells. For this purpose, supernatant at appropriate dilutions
of the original stock was added to 293T cells (plated onto 6 or 12 well plates 24
hours prior to transduction). 8 µg/ml Polybrene (Sigma) was added to each well
and 48 hours post transduction viral titres were determined by X-gal staining.
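
The text does not give the arithmetic used to convert stained-cell counts into a titre. A common convention, assumed here, is to divide the number of X-gal positive cells (or foci) in a well by the volume of undiluted stock that the well received. The function name and the example numbers are hypothetical.

```python
def titre_per_ml(positive_count, dilution, volume_ml):
    """Estimate transducing units per ml of the original supernatant.

    positive_count: X-gal positive cells (or foci) counted in the well
    dilution: fraction of original stock in the inoculum (1e-3 for 1:1000)
    volume_ml: volume of diluted supernatant added to the well, in ml
    """
    if dilution <= 0 or volume_ml <= 0:
        raise ValueError("dilution and volume must be positive")
    return positive_count / (dilution * volume_ml)


if __name__ == "__main__":
    # Hypothetical well: 50 positive cells from 0.5 ml of a 1:1000 dilution.
    print(f"{titre_per_ml(50, 1e-3, 0.5):.2e} units/ml")
```

Counts are normally taken from a dilution at which positive cells are well separated, since overlapping foci lead to underestimates.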

Luminescent β-galactosidase (β-gal) assays

These were performed on total cell extracts using a luminescent β-gal reporter
system (CLONTECH). Untransfected 293T cells were used as negative control and
293T cells transfected with pCMV-β-gal (CLONTECH) were used as positive
control.

RNA analysis 

Total or cytoplasmic RNA was extracted from 293T cells by using the RNeasy mini
kit (QIAGEN) 36-48 hours post-transfection. 5-10 µg of RNA was subjected to
Northern blot analysis as previously described (51). Correct fractionation was
verified by staining of the agarose gel. A probe complementary to bases 1222-1503
of the gag-pol gene was amplified by PCR from the HIV-1 pNL4-3 proviral clone
and was used to detect both the codon optimised and wild type gag-pol mRNAs. A
second probe, complementary to nt 1510-2290 of the codon optimised gene, was
also amplified by PCR from plasmid pSYNGP and was used to detect the codon
optimised genes only. A 732 bp fragment complementary to all vector genomes
used in this study was prepared by an SpeI-AvrII digestion of pH6nZ. A probe
specific for ubiquitin (CLONTECH) was used to normalise the results. All probes
were labelled by random labelling (STRATAGENE) with α-32P dCTP (Amersham).
The results were quantitated by using a Storm PhosphorImager (Molecular
Dynamics) and are shown in Figure 12. In the total cellular fractions the 47S rRNA
precursor could be clearly seen, whereas it was absent from the cytoplasmic
fractions. As expected (52), Rev stimulated the cytoplasmic accumulation of wild
type gag-pol mRNA (lanes 1c and 2c). RNA levels were 10-20 fold higher for the
codon optimised gene compared to the wild type one, both in total and cytoplasmic
fractions (compare lanes 3t-2t, 3c-2c, 10c-8c). The RRE sequence did not
significantly destabilise the codon optimised RNAs, since RNA levels were similar
for codon optimised RNAs whether or not they contained the RRE sequence
(compare lanes 3 and 5). Rev did not markedly enhance cytoplasmic accumulation
of the codon optimised gag-pol mRNAs, even when they contained the RRE
sequence (differences in RNA levels were less than 2-fold, compare lanes 3-4 or 5-
6).
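
The fold differences quoted above rest on normalising each band to the ubiquitin signal in the same lane. A minimal sketch of that arithmetic follows. The band intensities are invented for illustration and are not data from Figure 12, and the function names are hypothetical.

```python
def normalised(signal, loading_control):
    """Express a band intensity relative to the loading control in its lane."""
    if loading_control <= 0:
        raise ValueError("loading control signal must be positive")
    return signal / loading_control


def fold_change(test, test_control, reference, reference_control):
    """Ratio of two normalised band intensities (test lane over reference lane)."""
    return normalised(test, test_control) / normalised(reference, reference_control)


if __name__ == "__main__":
    # Hypothetical PhosphorImager counts: codon optimised lane vs wild type lane.
    print(fold_change(test=3000, test_control=150,
                      reference=200, reference_control=100))
```

Normalising before taking the ratio prevents unequal RNA loading from being read as a difference in expression.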

It appeared from a comparison of Figures 10 and 12 that all of the increase in
protein expression from syngp could be accounted for by the increase in RNA
levels. In order to investigate whether this was due to saturating levels of RNA in
the cell, we transfected 0.1, 1 and 10 µg of the wild type or codon optimised
expression vectors into 293T cells and compared protein production. In all cases
protein production was 10-fold higher for the codon optimised gene for the same
amount of transfected DNA, while the increase in protein levels was proportional
to the amount of transfected DNA for each individual gene. It seems likely
therefore that the enhanced expression of the codon optimised gene can be mainly
attributed to the enhanced RNA levels present in the cytoplasm and not to
increased translation.

Protein analysis

Total cell lysates were prepared from 293T cells 48 hours post-transfection
(unless otherwise stated) with an alkaline lysis buffer. For extraction of proteins
from cell supernatants, the supernatant was first passed through a 0.22 µm filter
and the vector particles were collected by centrifugation of 1 ml of supernatant at
21,000 g for 30 minutes. Pellets were washed with PBS and then re-suspended in
a small volume (2-10 µl) of lysis buffer. Equal protein amounts were separated on
an SDS 10-12% (v/v) polyacrylamide gel. Proteins were transferred to
nitrocellulose membranes, which were probed sequentially with a 1:500 dilution
of HIV-1 positive human serum (AIDS Reagent Project, ADP508, Panel E) and a
1:1000 dilution of horseradish peroxidase labelled anti-human IgG (Sigma,
A0176). Proteins were visualised using the ECL or ECL-plus western blotting
detection reagent (Amersham). To verify equal protein loading, membranes were
stripped and re-probed with a 1:1000 dilution of anti-actin antibody (Sigma,
A2066), followed by a 1:2000 dilution of horseradish peroxidase labelled anti-
rabbit IgG (Vector Laboratories, PI-1000).

Expression of gag-pol gene products and vector particle production

The wild type gag-pol (pGP-RRE3 - Figure 19) (49) and codon optimised
expression vectors (pSYNGP, pSYNGP-RRE and pSYNGP-ERR) were
transiently transfected into 293T cells. Transfections were performed in the
presence or absence of a Rev expression vector, pCMV-Rev (53), in order to
assess Rev-dependence for expression. Western blot analysis was performed on
cell lysates and supernatants to assess protein production. The results are shown
in Figure 10. As expected (54), expression of the wild type gene was observed only
when Rev was provided in trans (lanes 2 and 3). In contrast, when the codon
optimised gag-pol was used, there was high level expression in both the presence
and absence of Rev (lanes 4 and 5), indicating that in this system there was no
requirement for Rev. Protein levels were higher for the codon optimised gene
than for the wild type gag-pol (compare lanes 4-9 with lane 3). The difference
was more evident in the cell supernatants (approximately 10-fold higher protein
levels for the codon optimised gene compared to the wild type one, quantitated
by using a PhosphorImager) than in the cell lysates.

In previous studies where the RRE has been included in gag-pol expression
vectors that had been engineered to remove INS sequences, inclusion of the RRE
led to a decrease in protein levels that was restored by providing Rev in trans
(55). In our hands, the presence of the RRE in the fully codon optimised gag-pol
mRNA did not affect protein levels and provision of Rev in trans did not further
enhance expression (lanes 6 and 7).

In order to compare translation rates between the wild type and codon optimised
genes, protein production from the wild type and codon optimised expression
vectors was determined at several time intervals post transfection into 293T cells.
Protein production and particle formation were determined by Western blot
analysis and the results are shown in Figure 11. Protein production and particle
formation were 10-fold higher for the codon optimised gag-pol at all time points.

To further determine whether the enhanced expression observed with
the codon optimised gene was due to better translation or to effects on the
RNA, RNA analysis was carried out.

The efficiency of vector production using the codon optimised gag-pol gene

To determine the effects of the codon optimised gag-pol on vector production, we
used an HIV vector genome, pH6nZ, and the VSV-G envelope expression
plasmid pHCMVG (113), in combination with either pSYNGP, pSYNGP-RRE,
pSYNGP-ERR or pGP-RRE3 as a source for the gag-pol, in a plasmid ratio of
2 : 1 : 2 in a 3 plasmid co-transfection of 293T cells (49). Whole cell extracts and
culture supernatants were evaluated by Western blot analysis for the presence of
the gag and gag-pol gene products. Particle production was, as expected (Figure
10), 5-10 fold higher for the codon optimised genes when compared to the wild
type.

To determine the effects of the codon optimised gag-pol gene on vector titres,
several ratios of the vector components were used. The results are shown in
Figure 21. Where the gag-pol was the limiting component in the system (as
determined by the drop in titres observed with the wild type gene), titres were 10-
fold higher for the codon optimised vectors. This is in agreement with the higher
protein production observed for these vectors, but suggests that under normal
conditions of vector production gag-pol is saturating and the codon optimisation
gives no maximum yield advantage.

The effect of HIV-1 gag INS sequences on the codon optimised gene is
position dependent

It has previously been demonstrated that insertion of wild type HIV-1 gag
sequences downstream of other RNAs, e.g. HIV-1 tat (56), HIV-1 gag (55) or
CAT (57), can lead to a dramatic decrease in steady state mRNA levels,
presumably as a result of the INS sequences. In other cases, e.g. for β-globin
(58), it was shown that the effect was splice site dependent. Cellular AREs (AU-
rich elements) that are found in the 3' UTR of labile mRNAs may confer mRNA
destabilisation by inducing cytoplasmic deadenylation of the transcripts (59). To
test whether HIV-1 gag INS sequences would destabilise the codon optimised
RNA, the wild-type HIV-1 gag sequence, or parts of it (nt 1-625 or nt 625-1503),
were amplified by PCR from the proviral clone pWI3. All fragments were blunt
ended and were inserted into pSYNGP or pSYNGP-RRE at either a blunted
EcoRI or NotI site (upstream or downstream of the codon optimised gag-pol
gene respectively). As controls, the wt HIV-1 gag in the reverse orientation (as
INS sequences have been shown to act in an orientation dependent manner (57))
(pSYN7) and lacZ, excised from plasmid pCMV-βgal (CLONTECH) (in the
correct orientation) (pSYN8), were also inserted in the same site. Contrary to our
expectation, as shown in Figure 13, the wild type HIV-1 gag sequence did not
appear to significantly affect RNA or protein levels of the codon optimised gene.
We further constructed another series of plasmids (by PCR and from the same
plasmids) where the wild type HIV-1 gag in the sense or reverse orientation,
subfragments of gag (nt 1-625 or nt 625-1503), the wild type HIV-1 gag without
the ATG or with a frameshift mutation 25 bases downstream of the ATG, or nt
72-1093 of LacZ (excised from plasmid pH6Z), or the first 1093 bases of lacZ
with or without the ATG were inserted upstream of the codon optimised HIV-1
gag-pol gene in pSYNGP and/or pSYNGP-RRE (pSYN9-pSYN22, Figure 14).
Northern blot analysis showed that insertion of the wild type HIV-1 gag gene
upstream of the codon optimised HIV-1 gag-pol (pSYN9, pSYN10) led to
diminished RNA levels in the presence or absence of Rev/RRE (Figure 15A,
lanes 1-4 and Figure 15B, lanes 1+12). The effect was not dependent on
translation, as insertion of a wild type HIV-1 gag lacking the ATG or with a
frameshift mutation (pSYN12, pSYN13 and pSYN14) also diminished RNA
levels (Figure 15B, lanes 1-7). Western blot analysis verified that there was no
HIV-1 gag translation product for pSYN12-14. However, it is possible that, as the
wt HIV-1 gag exhibits such an adverse codon usage, it may act as a non-
translatable long 5' leader for syngp, and if this is the case, then the ATG
mutation should not have any effects.

Insertion of smaller parts of the wild type HIV-1 gag gene (pSYN15 and
pSYN17) also led to a decrease in RNA levels (Figure 15B, lanes 1-3 and 8-9),
but not to levels as low as when the whole gag sequence was used (lanes 1-3, 4-7
and 8-9 in Figure 15B). This indicates that the effect of INS sequences is
dependent on their size. Insertion of the wild type HIV-1 gag in the reverse
orientation (pSYN11) had no effect on RNA levels (Figure 15A, lanes 1 and 5-6).
However, a splicing event seemed to take place in that case, as indicated by the
size of the RNA (equal to the size of the codon optimised gag-pol RNA) and by
the translation product (gag-pol, in equal amounts compared to pSYNGP, as
verified by Western blot analysis).

These data indicate therefore that wild type HIV-1 gag instability sequences act
in a position and size dependent manner, probably irrespective of translation. It
should also be noted that the RRE was unable to rescue the destabilised RNAs
through interaction with Rev.

Construction of an HIV-1 based vector system that lacks all the accessory 
proteins 

Until now several HIV-1 based vector systems have been reported that lack all
accessory proteins but Rev (49, 60). We wished to investigate whether the codon
optimised gene would permit the construction of an HIV-1 based vector system that
lacks all accessory proteins. We initially deleted rev/RRE and any residual env
sequences, but kept the first 625 nucleotides of gag, as they have been shown to
play a role in efficient packaging (61). Two vector genome constructs were made,
pH6.1nZ (retaining only HIV sequences up to nt 625 of gag) and pH6.2nZ (same as
pH6.1nZ, but also retaining the env splice acceptor). These were derived from a
conventional HIV vector genome that contains RRE and expresses Rev (pH6nZ).
Our 3-plasmid vector system now expressed only HIV-1 gag-pol and the VSV-G
envelope proteins. Vector particle titres were determined as described in the
previous section. A ratio of 2 : 2 : 1 of vector genome (pH6Z or pH6.1nZ or
pH6.2nZ) : gag-pol expression vector (pGP-RRE3 or pSYNGP) : pHCMV-G was
used. Transfections were performed in the presence or absence of pCMV-Rev, as
gag-pol expression was still Rev dependent for the wild type gene. The results are
summarised in Figure 22 and indicate that an HIV vector could be produced in the
total absence of Rev, but that maximum titres were compromised, being 20-fold
lower than could be achieved in the presence of Rev. As gag-pol expression should
be the same for pSYNGP with pH6nZ or pH6.1nZ or pH6.2nZ (since it is Rev
independent), as well as for pGP-RRE3 when Rev is provided in trans, we
suspected that the vector genome retained a requirement for Rev and was therefore
limiting the titres. To confirm this, Northern blot analysis was performed on
cytoplasmic RNA prepared from cells transfected with pH6nZ or pH6.1nZ in the
presence or absence of pCMV-Rev. As can be seen in Figure 17, lanes 1-4, the
levels of cytoplasmic RNA derived from pH6nZ were 5-10 fold higher than those
obtained with pH6.1nZ (compare lanes 1-2 to lanes 3-4). These data support the
notion that RNA produced from the vector genome requires the Rev/RRE system to
ensure high cytoplasmic levels. This may be due to inefficient nuclear export of the
RNA, as INS sequences residing within gag were still present.

Further deletions in the gag sequences of the vector genome might therefore be
necessary to restore titres. To date efficient packaging has been reported to require
360 (62) or 255 (63) nucleotides of gag in vectors that still retain env sequences, or
about 40 nucleotides of gag in a particular combination of splice donor mutation,
gag and env deletions (64, 63). In an attempt to remove the requirement for
Rev/RRE in our vector genome without compromising efficient packaging we
constructed a series of vectors derived from pH6nZ containing progressively larger
deletions of HIV-1 sequences (only sequences upstream and within gag were
retained) plus and minus a mutant major splice donor (SD) (GT to CA mutation).
Vector particle titres were determined as before and the results are summarised in
Figure 23. As can be seen, deletion of up to nt 360 in gag (vector pHS3nZ) resulted
in an increase in titres (compared to pH6.1nZ or pH6.2nZ) and only a 5-fold
decrease (titres were 1.3-1.7 x 10^5) compared to pH6nZ. Further deletions resulted
in titres lower than pHS3nZ and similar to pH6.1nZ. In addition, the SD mutation
did not have a positive effect on vector titres and in the case of pHS3nZ it resulted
in a 10-fold decrease in titres (compare titres for pHS3nZ and pHS7nZ in Figure
23). Northern blot analysis on cytoplasmic RNA (Figure 17, lanes 1 and 5-12)
showed that RNA levels were indeed higher for pH6nZ, which could account for
the maximum titres observed with this vector. RNA levels were equal for pHS1nZ
(lane 5), pHS2nZ (lane 6) and pHS3nZ (lane 7) whereas titres were 5-8 fold higher
for pHS3nZ. It is possible that further deletions (than that found in pHS3nZ) in gag
might result in less efficient packaging (as for HIV-1 the packaging signal extends
into gag) and therefore, even though all 3 vectors produce similar amounts of RNA,
only pHS3nZ retains maximum packaging efficiency. It is also interesting to note
that the SD mutation resulted in increased RNA levels in the cytoplasm (compare
lanes 6 and 10, 7 and 11 or 8 and 12 in Figure 17) but equal or decreased titres
(Figure 23). The GT dinucleotide that was mutated is in the stem of SL2 of the
packaging signal (65). It has been reported that SL2 might not be very important for
HIV-1 RNA encapsidation (65, 66), whereas SL3 is of great importance (67).
Folding of the wild type and SD-mutant vector sequences with the RNAdraw
software program revealed that the mutation significantly alters the secondary
structure of the RNA and not only of SL2. It is likely therefore that although the SD
mutation enhances cytoplasmic RNA levels it does not increase titres as it alters the
secondary structure of the packaging signal.

To investigate whether the titre differences that were observed with the Rev minus
vectors were indeed due to Rev dependence of the genomes, the RRE sequence (nt
7769 - 8021 of the HXB2 sequence) was inserted in the SpeI site (downstream of
the gag sequence and just upstream of the internal CMV promoter) of pH6.1nZ,
pHS1nZ, pHS3nZ and pHS7nZ, resulting in plasmids pH6.1nZR, pHS1nZR,
pHS3nZR and pHS7nZR respectively. Vector particle titres were determined with
pSYNGP and pHCMVG in the presence or absence of Rev (pCMV-Rev) as before
and the results are summarised in Figure 24. In the absence of Rev titres were
further compromised for pH6.1nZR (7-fold compared to pH6.1nZ), pHS3nZR (6-
fold compared to pHS3nZ) and pHS7nZR (2.5-fold compared to pHS7nZ). This
was expected, as the RRE also acts as an instability sequence (68) and so it would
be expected to confer Rev-dependence. In the presence of Rev titres were restored
to the maximum titres observed for pH6nZ in the case of pHS3nZR (5 x 10^5) and
pH6.1nZR (2 x 10^5). Titres were not restored for pHS7nZR in the presence of Rev.
This supports the hypothesis that the SD mutation in pHS7nZ affects the structure of
the packaging signal and thus the packaging ability of this vector genome, as in this
case Rev may be able to stimulate vector genome RNA levels, as for pHS3nZR and
pH6.1nZR, but it cannot affect the secondary structure of the packaging signal. For
vector pHS1nZ inclusion of the RRE did not lead to a decrease in titres. This could
be due to the fact that pHS1nZ contains only 40 nucleotides of gag sequences and
therefore even with the RRE the size of instability sequences is not higher than for
pHS2nZ, which gives equal titres to pHS1nZ. Rev was able to partially restore titres
for pHS1nZR (10-fold increase when compared to pHS1nZ and 8-fold lower than
pH6nZ) but not fully as in the case of pHS3nZ. This is also in agreement with the
hypothesis that 40 nucleotides of HIV-1 gag sequences might not be sufficient for
efficient vector RNA packaging and this could account for the partial and not
complete restoration in titres observed with pHS1nZR in the presence of Rev.

In addition, end-point titres were determined for pHS3nZ and pH6nZ with
pSYNGP in HeLa and HT1080 human cell lines. In both cases titres followed the
pattern observed in 293T cells, with titres being 2-3 fold lower for pHS3nZ than
for pH6nZ (See Figure 10). Finally, transduction efficiency of vector produced
with pHS3nZ or pH6nZ and different amounts of pSYNGP or pGP-RRE3 at
different m.o.i.'s (and as high as 1) was determined in HT1080 cells. This
experiment was performed as the high level gag-pol expression from pSYNGP
may result in interference by genome-empty particles at high vector
concentrations. As expected for VSVG pseudotyped retroviral particles (69),
transduction efficiencies correlated with the m.o.i.'s, whether high or low
amounts of pSYNGP were used and with pH6nZ or pHS3nZ. For m.o.i. 1
transduction efficiency was approximately 50-60% in all cases (Figure 18). The
above data indicate that no interference due to genome-empty particles is
observed in this experimental system.
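
For orientation, the observed efficiency at m.o.i. 1 can be compared with a simple expectation. The sketch below assumes that transducing particles are distributed over cells according to a Poisson distribution, which the text does not state, so that the fraction of cells receiving at least one particle is 1 - exp(-m.o.i.). The function name is hypothetical.

```python
import math


def expected_transduced_fraction(moi):
    """Fraction of cells receiving at least one particle under a Poisson model."""
    if moi < 0:
        raise ValueError("m.o.i. cannot be negative")
    return 1.0 - math.exp(-moi)


if __name__ == "__main__":
    for moi in (0.1, 0.5, 1.0):
        fraction = expected_transduced_fraction(moi)
        print(f"m.o.i. {moi}: {100 * fraction:.0f}% expected")
```

Under this assumption an m.o.i. of 1 predicts about 63% transduced cells, close to the 50-60% reported, which is consistent with the absence of marked interference.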

The codon optimised gag-pol gene does not use the exportin-1 nuclear export
pathway

Rev mediates the export of unspliced and singly spliced HIV-1 mRNAs via the
nuclear export receptor exportin-1 (CRM1) (70, 71, 72, 73, 74). Leptomycin B
(LMB) has been shown to inhibit leucine-rich NES mediated nuclear export by
disrupting the formation of the exportin-1/NES/RanGTP complex (75, 72). In
particular, LMB inhibits nucleo-cytoplasmic translocation of Rev and Rev-
dependent HIV mRNAs (76). To investigate whether exportin-1 mediates the
export of the codon optimised gag-pol constructs, the effect of LMB on protein
production was tested. Western blot analysis was performed on cell lysates from
cells transfected with the gag-pol constructs (+/- pCMV-Rev) and treated or not
with LMB (7.5 nM, for 20 hours, beginning treatment 5 hours post-transfection).
To confirm that LMB had no global effects on transport, the expression of β-gal
from the control plasmid pCMV-βGal was also measured. An actin internal
control was used to account for protein variations between samples. The results
are shown in Figure 16. As expected (76), the wild type gag-pol was not
expressed in the presence of LMB (compare lanes 3 and 4), whereas LMB had no
effect on protein production from the codon optimised gag-pol, irrespective of
the presence of the RRE in the transcript and the provision of Rev in trans
(compare lanes 5 and 6, 7 and 8, 9 and 10, 11 and 12, 5-6 and 11-12). The
resistance of the expression of the codon-optimised gag-pol to inhibition by LMB
indicates that the exportin-1 pathway is not used and therefore an alternative
export pathway must be used. This offers a possible explanation for the Rev
independent expression. The fact that the presence of a nonfunctional Rev/RRE
interaction did not affect expression implies that the RRE does not necessarily act
as an inhibitory (e.g. nuclear retention) signal per se, which is in agreement with
previous observations (5, 58).

In conclusion, this is the first report of an HIV-1 based vector system, composed
of pSYNGP, pHS3nZ and pHCMVG, where significant vector production can be
achieved in the absence of all accessory proteins. These data indicate that in order
to achieve maximum titres the HIV vector genome must be configured to retain
efficient packaging and that this requires the retention of gag sequences and a
splice donor. By reducing the gag sequence to 360 nt in pHS3nZ and combining
this with pSYNGP it is possible to achieve a titre of at least 10^5 LU./ml that is only
5-fold lower than the maximum levels achieved in the presence of Rev.

Example 2 - EIAV

Codon-optimised EIAV gag-pol expression cassettes

We also examined whether the codon-optimisation process would alter the
properties of the gag-pol gene of the non-primate lentivirus EIAV. The sequence of
the codon-optimised gene is shown from nt 1103 to 5760 of SEQ ID NO:5 (Figure 9).
The wild type and the codon-optimised sequences are denoted WT and CO,
respectively. The codon usage was changed to that of highly expressed mammalian
genes. pESYNGP (Figure 27 and SEQ ID NO:5) was made by transferring an
XbaI-NotI fragment from a plasmid containing a codon-optimised EIAV gag/pol
gene, synthesised by Operon Technologies Inc., Alameda, CA, into pCIneo
(Promega). The gene was supplied in a proprietary plasmid backbone, GeneOp. The
fragment transferred to pCIneo includes sequences flanking the codon-optimised
EIAV gag/pol ORF: tctagaGAATTCGCCACCATG- EIAV gag/pol-
TGAACCCGGGgcggccgc. The ATG start and TGA stop codons are shown in
bold and the recognition sequences for XbaI and NotI sites in lower case.
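
The flanking sequences quoted above can be checked for the motifs they are said to contain. The sketch below is illustrative only. The recognition sequences are the standard ones for XbaI, EcoRI and NotI; the labels for GCCACCATG (Kozak consensus plus start codon) and CCCGGG (a SmaI site) are annotations added here and are not taken from the text.

```python
# Recognition motifs; "Kozak+ATG" and "SmaI" are annotations assumed here.
MOTIFS = {
    "XbaI": "TCTAGA",
    "EcoRI": "GAATTC",
    "Kozak+ATG": "GCCACCATG",
    "SmaI": "CCCGGG",
    "NotI": "GCGGCCGC",
}

FIVE_PRIME_FLANK = "tctagaGAATTCGCCACCATG"   # as quoted in the text
THREE_PRIME_FLANK = "TGAACCCGGGgcggccgc"     # as quoted in the text


def find_motifs(seq):
    """Return {motif name: 0-based position of first match} for motifs present."""
    upper = seq.upper()
    return {name: upper.find(motif)
            for name, motif in MOTIFS.items() if motif in upper}


if __name__ == "__main__":
    print(find_motifs(FIVE_PRIME_FLANK))
    print(find_motifs(THREE_PRIME_FLANK))
```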

The expression of Gag/Pol from the codon-optimised gene was assessed with
respect to that from various wild type EIAV gag/pol expression constructs by
transient transfection of HEK 293T cells (Figure 25). Transfections were carried
out using the calcium phosphate technique, using equal moles of each Gag/Pol
expression plasmid together with a plasmid which expressed EIAV Rev either
from the wild type sequence or from a codon-optimised version of the gene:
pCIneoEREV (WO 99/32646) (Figure 35 and SEQ ID NO:13) or pESYNREV
(Figure 36 and SEQ ID NO:14), respectively. pESYNREV is a pCIneo-based
plasmid (Promega) which was made by introducing the EcoRI to SalI fragment
from a synthetic EIAV REV plasmid, made by Operon Technologies, Alameda,
CA. The plasmid backbone was the proprietary plasmid GeneOp, in which was
inserted a codon-optimised EIAV REV gene flanked by EcoRI and SalI
recognition sequences and a Kozak consensus sequence to drive efficient
translation of the gene. The mass of DNA in each transfection was equalised by
addition of pCIneo plasmid. In transfections in which a Rev expression plasmid
was omitted, a similar mass of pCIneo (Promega) was used instead (lanes
labelled pCIneo). Cytoplasmic extracts were prepared 48 hours post transfection
and 15 µg amounts of protein were fractionated by SDS-PAGE and then
transferred to Hybond ECL. The Western blot was probed with a polyclonal
antiserum from an EIAV-infected horse and then with a secondary antibody, anti-
horse horseradish peroxidase conjugate. Development of the blot was carried
out using the ECL kit (Amersham). Positive controls for the blotting and
development procedure, and cytoplasmic extract from untransfected HEK 293T
cells are as indicated. The positions of various EIAV proteins are indicated.

Expression from wild type gag/pol was achieved from various plasmids (see
Figure 25). pONY3.2T is a derivative of pONY3.1 (WO 99/32646) (Figure 34
and SEQ ID NO: 12) in which mutations which ablate expression of Tat and S2
have been made. In addition, the EIAV sequence is truncated downstream of the
second exon of rev. Specifically, expression of Tat is ablated by an 83nt deletion
in exon 2 of tat which corresponds, with respect to the wild type EIAV sequence,
Acc. No. U01866, to deletion of nt 5234-5316 inclusive. S2 ORF expression is
ablated by a 51nt deletion, corresponding to nt 5346-5396 of Acc. No. U01866.
The EIAV sequence is deleted downstream of a position corresponding to nt
7815 of Acc. No. U01866. These alterations do not alter rev, hence this gene is
expressed as for pONY3.1. pONY3.2 OPTI is a derivative of
pONY3.1 which has the same deletions for ablation of Tat and S2 expression as
described above. In addition, the first 372nt of gag have been 'codon-optimised'
for expression in human cells. The wild type and codon-optimised
sequences present in pONY3.2 OPTI in this region are compared in
Figure 43. Base differences between the sequences are indicated. The region
which was codon-optimised represents the region of overlap between the vector
and wild-type gag/pol expression constructs. Reduction of homology within this
region would be expected to improve the safety profile of the vector system due
to the reduced chances of recombination between the vector genome and the
gag/pol transcripts. 3.2 OPTI-Ihyg is a derivative of 3.2 OPTI in which the
SnaBI-NotI fragment of 3.2 OPTI is transferred to pIRES1hygro (Clontech)
prepared for ligation by digestion at the same sites. The gag/pol gene is thus
placed upstream of the IRES hygromycin phosphotransferase. Of note is the fact
that the resulting construct contains the intron from pCIneo, not from
pIRES1hygro. pEV53B is a derivative of pEV53A (WO 98/51810) in which the
EIAV-derived sequence upstream of the Gag initiation codon is reduced to
include only the major splice donor and surrounding sequences:
CAG/GTAAGATG, where the terminal ATG is the Gag initiation codon.
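
The deletion sizes quoted above follow directly from the inclusive coordinate ranges given with respect to Acc. No. U01866. As a quick consistency check, a minimal Python sketch is given below; the helper name `inclusive_length` is hypothetical and is introduced only for illustration.

```python
def inclusive_length(first_nt: int, last_nt: int) -> int:
    """Number of nucleotides in a range whose two end positions are both included."""
    if last_nt < first_nt:
        raise ValueError("last_nt must not be smaller than first_nt")
    return last_nt - first_nt + 1


# Coordinates as stated in the text (numbering of Acc. No. U01866).
tat_deletion = inclusive_length(5234, 5316)  # deletion in exon 2 of tat
s2_deletion = inclusive_length(5346, 5396)   # deletion in the S2 ORF

print(f"tat deletion: {tat_deletion} nt, S2 deletion: {s2_deletion} nt")
```

The computed lengths agree with the 83nt and 51nt deletions stated in the description.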

The results (Figure 26) show the Rev-dependence of Gag/Pol expression from
pHORSE3.1 (WO 99/32646), which has an EIAV-derived leader sequence
starting just downstream of the primer binding site and an RRE placed
downstream of gag/pol composed of the two EIAV sequences reported to have
RRE activity. Expression was enhanced by the same amount when Rev
expression was driven by wild type (pCIneoERev) (Figure 35) or
codon-optimised (pESYNREV) (Figure 36) genes. This result confirms the
functionality of the codon-optimised Rev expression plasmid.

In contrast to expression of Gag/Pol from pONY3.1, expression from pESYNGP
was not influenced by the presence of Rev; however, it was slightly lower than
from pONY3.1 or pONY3.2T. Expression from pESYNGPRRE (Figure 30 and
SEQ ID NO:7), in which the EIAV RRE sequence present in pHORSE3.1 is
placed downstream of gag/pol, appeared slightly lower than from pESYNGP.

The levels of expression from 3.2 OPTI and 3.2 OPTI-Ihyg were significantly
lower than from pESYNGP or pONY3.1, even in the presence of Rev. This
result suggested that there may be determinants of Gag/Pol expression within the
first 372nt of gag and showed that 3.2 OPTI was unlikely to be useful as a
basis for EIAV vector production. Furthermore, it demonstrates that
codon-optimisation of only certain regions of the whole gag/pol gene may not lead to
high levels of Rev-independent expression.

We have previously demonstrated (43) that the 5' leader (121 bp upstream of the
ATG start codon) and the RRE sequence (43) are important for high expression of
the wild type EIAV gag-pol. Three constructs were made that contained either the
leader sequence (LpESYNGP), the leader and RRE sequences (LpESYNGPRRE)
or the RRE sequence (pESYNGPRRE). The sequences of these constructs are
shown in SEQ ID NOS:6-8 and Figures 28-30. They were transfected into 293T
cells in either the presence or absence of Rev expression plasmid. The cell
supernatant was then assayed for reverse transcriptase (RT) activity, using a
conventional RT assay, to evaluate which construct generated the highest amount of
gag-pol mRNA. The results are shown in Figures 39 and 40. It is clear from these
results that the 5' leader leads to an increase in RT activity. The ability of these
Gag/Pol expression constructs to support formation of infectious vector particles
was also tested by transient transfection of HEK 293 cells. The results of this
analysis show that all of the constructs could provide functional EIAV Gag/Pol,
and show the Rev dependence of titre with the pONY8.0Z vector genome plasmid,
which does not encode any EIAV proteins (Figure 41).

The ability of pESYNGP to act in concert with a minimal EIAV vector genome
plasmid pONY8.1Z (Figure 33, SEQ ID NO:11) was evaluated (Figure 42). The
result shows that the titres obtained with pESYNGP and pONY8.1Z are about
10-fold lower than from pONY3.1 and pONY8.1Z. This reduced titre reflects the lack
of Rev protein in the system rather than a deficiency of Gag/Pol production, which
we have already shown is independent of Rev expression.

Expression of EIAV Gag/Pol was also tested from pESDSYNGP (Figure 50 and
SEQ ID NO: 18), in which the Kozak consensus sequence of Gag is replaced by
the natural EIAV splice donor. pESDSYNGP was made from pESYNGP by
exchange of the 306bp EcoRI-NheI fragment, which runs from just upstream of
the start codon for gag/pol to approximately 300 base pairs inside the gag/pol
ORF, with a 308bp EcoRI-NheI fragment derived by digestion of a PCR product
made using pESYNGP as template and using the following primers: SD FOR
[GGCTAGAGAATTCCAGGTAAGATGGGCGATCCCCTCACCTGG] and SD
REV [TTGGGTACTCCTCGCTAGGTTC]. This manipulation replaces the
Kozak consensus sequence upstream of the ATG in pESYNGP with the splice
donor found in EIAV. The sequence between the EcoRI site and the ATG of
gag/pol is thus CAGGTAAG, exactly as found in the natural viral sequence.
Therefore the mRNA is deleted with respect to sequences upstream but not
downstream of the splice donor. The performance of pESDSYNGP was assessed
relative to pESYNGP and other expression plasmids by measurement of reverse
transcriptase activity in supernatants from transiently transfected HEK 293T cells
using a Taqman-based version of the product enhanced reverse transcriptase
(PERT) assay. In this method, reverse transcriptase associated with vector
particles is released by mild detergent treatment and used to synthesize cDNA
using MS2 bacteriophage RNA as template. MS2 RNA template and primer are
present in excess, hence the amount of cDNA is proportional to the amount of RT
released from the particles. Therefore, the amount of cDNA synthesised is
proportional to the number of particles. MS2 cDNA is then quantitated using
Taqman technology. The assay is carried out on test samples in parallel with a
vector stock of known titre and estimated particle content. The use of the
standard allows creation of a 'standard curve' and allows the relative RT content
of various samples to be calculated. The results of this analysis are shown in
Figure 49. The results show that Gag/Pol expression is virtually identical from
pESYNGP and pESDSYNGP. The results also indicate that expression is not
significantly enhanced by Rev. The activity of the Rev expression plasmid is
confirmed by the result obtained with pHORSE +, in which there is an RRE
downstream of the wild type EIAV gag/pol, and that shows a 6-fold enhancement
of expression in the presence of Rev. We also noted that the expression from
pHORSE was enhanced 3-fold in the presence of Rev. Since this construct has
no RRE it suggests that Rev may be having a non-specific enhancing effect on
expression, possibly as a result of being expressed at high levels in this
experimental system.
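
The 'standard curve' step of the PERT assay described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis actually used: it assumes that Ct is linear in log10 of the relative RT content of the standard, and the dilution series and Ct values are hypothetical numbers chosen only to make the example run. The function names are likewise hypothetical.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = intercept + slope * log10(quantity)."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def relative_rt(ct, slope, intercept):
    """Invert the fitted line to obtain RT content relative to the standard."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical ten-fold dilution series of the vector stock used as standard.
standard_quantities = [1.0, 0.1, 0.01, 0.001]
standard_cts = [20.0, 23.3, 26.6, 29.9]

slope, intercept = fit_standard_curve(standard_quantities, standard_cts)

# Hypothetical Ct measured for a test supernatant.
sample_ct = 23.3
print(f"relative RT content: {relative_rt(sample_ct, slope, intercept):.3f}")
```

Reading test samples off a fitted line in this way gives RT content relative to the standard, which is the quantity compared between constructs in the text.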

The ability of pESYNGP to participate in the formation of infectious viral vector
particles, when co-transfected with plasmids for the vector genome and envelope,
was assessed by transient transfection of HEK 293T cells, as described previously (49,
50). Briefly, 293T cells were seeded on 6cm dishes (1.2 x 10^6/dish) and 24 hours
later they were transfected by the calcium phosphate procedure. The medium
was replaced 12 hours post-transfection and supernatants were harvested 48
hours post-transfection, filtered (0.45 µm filters) and titered by transduction of
D17 canine osteosarcoma cells in the presence of 8 µg/ml Polybrene (Sigma).
Cells were seeded at 0.9 x 10^5/well in 12 well plates 24 hours prior to use in
titration assays. Dilutions of supernatant were made in complete media
(DMEM/10%FBS) and 0.5ml aliquots plated out onto the D17 cells. 4 hours
after addition of the vector the media was supplemented with a further 1ml of
media. Transduction was assessed by X-gal staining of cells 48 hours after
addition of viral dilutions.
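
The description does not state the arithmetic used to convert counts of X-gal stained cells into a titre. A common convention, assumed here, is to multiply the number of stained cells (or foci) in a well by the dilution factor and divide by the volume of diluted supernatant applied. The sketch below uses the 0.5ml aliquot volume given above; the count and the dilution factor are hypothetical, and the function name is introduced only for illustration.

```python
def titre_tu_per_ml(stained_count: int, dilution_factor: float, volume_ml: float) -> float:
    """Transducing units per ml of undiluted supernatant.

    stained_count   -- X-gal positive cells or foci counted in one well
    dilution_factor -- e.g. 1e4 for a 1:10,000 dilution of the supernatant
    volume_ml       -- volume of diluted supernatant applied to the well
    """
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return stained_count * dilution_factor / volume_ml


# Hypothetical well: 45 stained foci at a 1:10,000 dilution, 0.5 ml applied.
example_titre = titre_tu_per_ml(45, 1e4, 0.5)
print(f"titre: {example_titre:.2e} transducing units per ml")
```

In practice only wells in the countable range of the dilution series would be used, which is why several dilutions are plated.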

The vector genomes used for these experiments were pONY4.0Z (Figure 31 and
SEQ ID NO:9) and pONY8.0Z (Figure 32 and SEQ ID NO: 10).

pONY4.0Z (WO 99/32646) was derived from pONY2.11Z by replacement of the
U3 region in the 5'LTR with the cytomegalovirus immediate early promoter
(pCMV). This was carried out in such a way that the first base of the transcript
derived from this CMV promoter corresponds to the first base of the R region.
This manipulation results in the production of high levels of vector genome in
transduced cells, particularly HEK 293T cells, and has been described previously
(50). pONY4.0Z expresses all EIAV proteins except for envelope, expression of
which is ablated by a deletion of 736nt between the HindIII sites present in env.

pONY8.0Z was derived from pONY4.0Z by introducing mutations which 1)
prevented expression of TAT by an 83nt deletion in exon 2 of tat, 2) prevented
S2 ORF expression by a 51nt deletion, 3) prevented REV expression by deletion
of a single base within exon 1 of rev and 4) prevented expression of the
N-terminal portion of gag by insertion of T in ATG start codons, thereby changing
the sequence to ATTG from ATG. With respect to the wild type EIAV sequence
Acc. No. U01866 these correspond to deletion of nt 5234-5316 inclusive, nt
5346-5396 inclusive and nt 5538. The insertion of T residues was after nt 526
and 543.

The results of this analysis are shown tabulated in Figure 37, and graphically in
Figure 38. Transfections were carried out with only three plasmids (vector genome,
gag/pol expression plasmid and VSV-G expression plasmid; diagonal lined
bars), or with four plasmids, which included the previous set of plasmids together
with an additional plasmid encoding Rev or a similar plasmid not coding a
functional protein (filled bars). The results show that high titres of vector can be
achieved using pESYNGP to supply EIAV Gag/Pol. The highest titres were
obtained using the Rev-expressing vector genome plasmid, pONY4.0Z, and they
were only slightly lower than observed when Gag/Pol was supplied by pONY3.1.
Lower titres were observed with pONY8.0Z vector genome plasmid with
pESYNGP than with pONY3.1. This is due to the Rev expression requirement of
pONY8.0Z. Rev is expressed by pONY3.1, but not pESYNGP. These results
confirm the utility of the codon-optimised Gag/Pol expression plasmid.

Use of the synthetic EIAV gag/pol gene in construction of cell lines which 
stably express EIAV gag/pol. 

Cell lines which express high amounts of EIAV Gag/Pol are required for the
construction of packaging and producer cells for EIAV vectors. As a first step in
their construction HEK 293 cells were stably transfected with pIRES1hyg
ESYNGP (Figure 44 and SEQ ID NO:17), in which EIAV Gag/Pol expression is
driven by a CMV promoter, and is linked to an ORF for expression of
hygromycin phosphotransferase by an EMCV IRES. pIRES1hyg ESYNGP was
made as follows. The synthetic EIAV gag/pol gene and flanking sequences were
transferred from pESYNGP into the pIRES1hygro expression vector (Clontech).
First, pESYNGP was digested with EcoRI, the ends filled in by treatment
with T4 DNA polymerase, and then digested with NotI. pIRES1hygro was
prepared for ligation with this fragment by digestion with NsiI, the ends trimmed
flush by treatment with T4 DNA polymerase, then digested with NotI. Prior to
transfection into HEK 293 cells pIRES1hyg ESYNGP was digested with AhdI,
which linearises the plasmid.

Clonal cell lines were derived by serial dilution and analysed for expression of
Gag/Pol by a Taqman-based product enhanced reverse transcriptase (PERT)
assay. Data for the cell line Q3.29, which expressed the highest level of Gag/Pol,
are shown. The analysis showed that the level of expression from the
codon-optimised EIAV Gag/Pol cassette in Q3.29 was very similar to that seen for an
EIAV producer line, 8Z.20, in which Gag/Pol is expressed from the pEV53B
wild type expression cassette, that produced vector particles at titres of almost 10^6
transducing units per ml (Figure 45). Assuming exponential amplification
during the assay, a difference of Ct value of 1.0 corresponds to a difference of
2-fold in concentration of the reverse transcriptase released from the particles.
Therefore the difference in Gag/Pol expression between Q3.29 and 8Z.20 cells is
approximately 2-8 fold. Furthermore the Ct values observed indicate that the
level of expression of Gag/Pol is significantly higher than in samples of
pONY8G vector particles with a titre of 2 x 10^6 transducing units per ml on D17
cells, but made by transient transfection of HEK 293T cells. These data indicate
that the codon-optimised EIAV Gag/Pol construct can be used in the construction
of EIAV packaging and producer lines and confirm the previous result that
expression is independent of Rev expression.
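
The relationship between Ct difference and fold difference stated above can be written out explicitly. The sketch below assumes, as the text does, ideal exponential amplification (a doubling per cycle); the function name is hypothetical. On that assumption the quoted 2-8 fold range corresponds to a Ct difference of between 1 and 3 cycles.

```python
def fold_difference(delta_ct: float) -> float:
    """Fold difference in RT concentration for a given Ct difference,
    assuming the amount of product doubles in every cycle."""
    return 2.0 ** delta_ct


for delta_ct in (1.0, 2.0, 3.0):
    print(f"delta Ct = {delta_ct:.1f} -> {fold_difference(delta_ct):.0f}-fold")
```

If amplification efficiency is below 100%, the base of the exponent is less than 2 and the same Ct difference corresponds to a smaller fold difference.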

The Q3.29 cell line was then tested for its ability to support production of
infectious vector particles when transfected with a vector genome plasmid,
pONY8.0Z, the VSV-G envelope expression plasmid, pRV67, and the EIAV
REV expression plasmid, pESYNREV. In addition we also evaluated the
performance of a plasmid, pONY8.3G FB29 (-), which is a modified form of the
pONY8G vector genome plasmid. pONY8G is a standard EIAV vector genome
used for comparison purposes. The modifications and construction of
pONY8.3G FB29 (-) (SEQ ID NO: 19) are described in PCT/GB00/03837 and
briefly are 1) the introduction of loxP recognition sites upstream and downstream
of the vector genome cassette and 2) the placement of an expression cassette for
codon-optimised REV, derived from pESYNREV, and driven by the FB29 U3
promoter, downstream of the vector genome cassette and orientated so that the
direction of transcription was towards the vector genome cassette. The REV
expression cassette is located upstream of the 3' loxP site. Thus the pONY8.3G
FB29 (-) plasmid carries expression cassettes for the vector genome RNA and for
EIAV Rev.

The titres were established by limiting dilution on D17 canine osteosarcoma cells
and are shown in Figure 46.

The titres obtained from transfections 2-6 were up to 4.5 x 10^6 transducing units
per ml, indicating levels of Gag/Pol expression sufficient to support titres at least
this high. The titres obtained were not higher when additional Gag/Pol was
supplied (transfection 1), indicating that Gag/Pol expression was not the limitation
on titre.

Improved safety profile due to Gag/Pol expression from a codon-optimised 
expression construct 

RCR formation takes place by recombination between different components of
the vector system or by recombination of vector system components with
nucleotide sequences present in the producer cells. Although recombination at
the DNA level during construction of producer cell lines is possible (perhaps
leading to insertional activation of endogenous retroelements or retroviruses) it is
thought that recombination to produce RCR occurs mainly between RNAs
undergoing reverse transcription, and hence occurs within the mature vector particles.
In consequence, recombination will be more likely to occur between RNAs
which contain packaging signals, such as the vector genome and the gag/pol
mRNA. Usually, however, the gag/pol transcript is modified so that it is deleted
with respect to some or all defined packaging elements, thereby reducing the
chances of its involvement in recombination.

The codon-optimisation process used to create the HIV and EIAV Gag/Pol
expression plasmids, pSYNGP and pESYNGP, also results in disruption of
sequences and structures that direct packaging as a result of introducing changes
at approximately every 3rd nucleotide position. We have obtained evidence for
the lower level of incorporation of the codon-optimised RNA derived from
pESYNGP into virions.
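
To illustrate why codon-optimisation changes many nucleotide positions while leaving the encoded protein unchanged, the following Python sketch recodes a short open reading frame and counts the base differences. It is an illustration under stated assumptions only: the single 'preferred' codon chosen for each amino acid is an illustrative choice and is not the codon table used to design pSYNGP or pESYNGP, and the 21nt input is simply a short example reading frame. All function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One illustrative "preferred" codon per amino acid (an assumption for this
# sketch, not the table used by the authors).
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC", "Q": "CAG",
    "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC", "L": "CTG", "K": "AAG",
    "M": "ATG", "F": "TTC", "P": "CCC", "S": "AGC", "T": "ACC", "W": "TGG",
    "Y": "TAC", "V": "GTG", "*": "TGA",
}


def translate(seq: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def recode(seq: str) -> str:
    """Replace every codon with the preferred codon for the same amino acid."""
    return "".join(PREFERRED[aa] for aa in translate(seq))


def count_differences(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


wild_type = "ATGGGAGACCCTTTGACATGG"  # short example reading frame
optimised = recode(wild_type)

print(optimised)
print("protein unchanged:", translate(wild_type) == translate(optimised))
print("base differences:", count_differences(wild_type, optimised), "of", len(wild_type))
```

Because most synonymous codons differ at the third (wobble) position, recoding of this kind scatters changes along the whole reading frame, which is the property the description relies on for reducing homology with the vector genome.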

The packaging of mRNAs derived from a wild type gag/pol pEV53B expression
cassette, and from the codon-optimised EIAV gag/pol expression cassette,
pESYNGP, was compared. Medium was collected from HEK 293 based cell
lines which were stably transfected with either pEV53B (cell line B-241), or with
pESYNGP. Both cell lines produce vector particles which do not contain vector
RNA and do not have envelopes. In some experiments, an EIAV vector genome
plasmid (pECG3-CZW) was transfected into the cells to serve as an internal
positive control for hybridisation and for the presence of particles capable of
packaging RNA. pECG3-CZW is a derivative of pEC-LacZ (WO 98/51810) and
was made from the latter by 1) reduction of gag sequences so that only the first
200nt of gag, rather than the first 577nt, was included and 2) inclusion of the
woodchuck hepatitis virus post-transcriptional regulatory element (WHV PRE)
(corresponding to nt 901-1800 of Acc. No. J04514) into the NotI site downstream
of the LacZ reporter gene.

Viral particles derived from each of the cell lines were then partially purified
from the medium by equilibrium density gradient centrifugation. To do this, 10
ml of medium from producer cells, harvested at 24 hours after induction with
sodium butyrate, was layered onto a 20-60% (w/w) sucrose gradient in TNE
buffer (pH 7.4) and centrifuged for 24 hours at 25,000 rpm and 4°C in a SW28
rotor. Fractions were collected from the bottom and 10 µl of each fraction was
assayed for reverse transcriptase activity to locate viral particles. The results of
this analysis are shown in Figure 47, where the profile of reverse transcriptase
activity is shown as a function of gradient fraction. In these figures, the top of
the gradient is on the right. It should be noted that the levels of RT activity from
the pESYNGP-expressing cells were significantly lower than from
pEV53B-expressing cells. To determine the RNA content of the purified virions, aliquots
from the top, middle or bottom fractions were pooled (as indicated by the bars
labeled T, M and B) and the RNA from each pool was subjected to slot-blot
hybridization analysis. Using a probe specific for a common region of wild type
and synthetic gag/pol, encapsidation of RNA was easily detectable in the peak
fractions (M) of virions synthesized from the wild type construct (pEV53B), but
was not detected from virions synthesized from the synthetic Gag/Pol construct
(pESYNGP) (Figure 48). The control for the presence of capsid capable of
carrying out encapsidation was the EIAV G3-CZW vector genome, which was
readily detected in peak fractions from cells expressing either the wild type or
synthetic gag/pol proteins. Even taking into account the different levels of
expression from the wild type and synthetic Gag/Pol expression constructs, this
result indicates that the RNA from the codon-optimised gag/pol gene is packaged
significantly less efficiently than that from the wild type gene and represents a significant
improvement to the safety profile of the system. Of further note is that the RNA
transcribed from pEV53B was packaged. This RNA is deleted with respect to
sequences upstream of the splice donor sequence (CAG/GTAAG) and yet was
still packaged. This points to the localisation of major packaging determinants
within the gag coding region and is in contrast to the collected observations on
the location of the packaging signal of HIV-1.

In additional experiments we have shown that the packaging of transcripts from
pEV53B is only slightly lower than from pEV53A (Figure 51). This indicates
further that major packaging sequences are located within the gag coding region.
In these experiments cell line B-241 expressed pEV53B RNA and PEV-17
expressed pEV53A RNA. The EIAV vector genome used to confirm the
presence of packaging-competent vector particles was G3-CZR, which is the
same as G3-CZW, described above, except for the replacement of the woodchuck
post-transcriptional regulatory element with a sequence containing the EIAV
RRE elements. Methodology was as described above.

All publications mentioned in the above specification are herein incorporated by
reference. Various modifications and variations of the described methods and
system of the invention will be apparent to those skilled in the art without departing
from the scope and spirit of the invention. Although the invention has been
described in connection with specific preferred embodiments, it should be
understood that the invention as claimed should not be unduly limited to such
specific embodiments. Indeed, various modifications of the described modes for
carrying out the invention which are obvious to those skilled in molecular biology
or related fields are intended to be within the scope of the following claims.

CLAIMS

1. Use of a nucleotide sequence coding for retroviral gag and pol proteins
capable of assembly of a retroviral vector genome into a retroviral particle in a
producer cell to generate a replication defective retrovirus in a target cell,
wherein the nucleotide sequence is codon optimised for expression in the
producer cell.

2. Use of a nucleotide sequence coding for retroviral gag and pol proteins
capable of assembly of a retroviral vector genome into a retroviral particle in a
producer cell to prevent packaging of the retroviral vector genome in a target cell,
wherein the nucleotide sequence is codon optimised for expression in the
producer cell.

3. Use of a nucleotide sequence coding for retroviral gag and pol proteins
capable of assembly of a retroviral vector genome comprising at least part of a
gag nucleotide sequence into a retroviral particle in a producer cell to prevent
recombination between said nucleotide sequence coding for retroviral gag and
pol proteins and the at least part of a gag nucleotide sequence, wherein the
nucleotide sequence coding for retroviral gag and pol proteins is codon optimised
for expression in the producer cell.

4. A use according to any preceding claim wherein the retroviral genome 
further comprises a nucleotide of interest (NOI). 

5. A use according to any preceding claim wherein the retroviral particle is a 
lentiviral particle. 

6. A use according to claim 5, wherein the retroviral particle is substantially
derived from HIV-1.

7. A use according to claim 6, wherein the codon optimised nucleotide
sequence has the sequence shown in SEQ ID NO: 15.

8. A use according to claim 5, wherein the retroviral particle is substantially
derived from EIAV.

9. A use according to claim 8, wherein the codon optimised nucleotide 
sequence has the sequence shown in SEQ ID NO: 16. 

10. A method of producing a replication defective retrovirus comprising
transfecting a producer cell with the following:

i) a retroviral genome; 

ii) a nucleotide sequence coding for retroviral gag and pol proteins; 
and 

iii) nucleotide sequences encoding other essential viral packaging 
components not encoded by the nucleotide sequence of (ii); 

characterised in that the nucleotide sequence coding for retroviral gag and pol 
proteins is codon optimised for expression in the producer cell. 

11. A method of preventing packaging of a retroviral genome in a target cell
comprising the steps of:

a. transfecting a producer cell with the following to produce 
retroviral particles: 

i) a retroviral genome; 

ii) a nucleotide sequence coding for retroviral gag and pol proteins; 

and 

iii) nucleotide sequences encoding other essential viral packaging 
components not encoded by one or more of the nucleotide 
sequences of (ii); and 

b. transfecting a target cell with retroviral particles of step (a);

characterised in that the nucleotide sequence coding for retroviral gag and pol
proteins is codon optimised for expression in the producer cell.

12. A method to prevent recombination between a retroviral vector genome 
and a nucleotide sequence encoding a viral polypeptide required for the assembly 
of the viral genome into retroviral particles comprising transfecting a producer 
cell with the following: 

(i) a retroviral genome comprising at least part of a gag nucleotide 
sequence; 

(ii) a nucleotide sequence coding for retroviral gag and pol proteins; 

and 

(iii) nucleotide sequences encoding other essential viral packaging 
components not encoded by the nucleotide sequence of (ii); 

characterised in that the nucleotide sequence coding for retroviral gag and pol 
proteins is codon optimised for expression in the producer cell. 

13. A method according to any one of claims 10 to 12 wherein the retroviral 
genome further comprises a nucleotide of interest (NOI). 

14. A method according to any one of claims 10 to 13 wherein the retroviral 
particle is a lentiviral particle. 

15. A method according to claim 14, wherein the retroviral particle is 
substantially derived from HIV-1. 

16. A method according to claim 15, wherein the codon optimised nucleotide 
sequence has the sequence shown in SEQ ID NO: 15. 

17. A method according to claim 14, wherein the retroviral particle is
substantially derived from EIAV.

18. A method according to claim 17, wherein the codon optimised nucleotide
sequence has the sequence shown in SEQ ID NO: 16.

19. A method according to any one of claims 10 to 18, wherein iii) comprises 
a nucleotide sequence coding for an env protein. 

20. A nucleotide sequence coding for retroviral gag and pol proteins having 
the sequence of SEQ. ID. No. 15 or 16. 

21. A method according to any one of claims 10 to 20 wherein at least one of 
i) to iii) contains one or more functional accessory genes. 

22. A method according to any one of claims 10 to 20 wherein i) to iii) are 
devoid of any functional accessory genes. 

23. A viral vector system comprising:

i) a nucleotide sequence of interest; and 

ii) a nucleotide sequence encoding a viral polypeptide required for 
the assembly of viral particles wherein the nucleotide sequence is as defined in 
claim 20. 

24. A viral production system comprising: 

i) a viral genome comprising at least one nucleotide sequence of 
interest; and 

ii) a nucleotide sequence encoding a viral polypeptide required for
the assembly of the viral genome into viral particles wherein the nucleotide
sequence is as defined in claim 20.

25. A system according to claim 23 or claim 24 wherein the viral vector is a
retroviral vector.

26. A system according to claim 25 wherein the retroviral vector is a 
lentiviral vector. 

27. A system according to any one of claims 23 to 26 wherein the lentiviral
vector is substantially derived from HIV-1 or EIAV.

28. A system according to any of claims 23 to 27 wherein the nucleotide 
sequence defined i-ii) also includes an envelope protein. 

29. A system according to claim 28 wherein the envelope gene is codon 
optimised. 

30. A system according to any of claims 23 to 29 wherein the nucleotide of 
interest is selected from a therapeutic gene, a marker gene and a selection gene. 

31. A system according to any one of claims 23 to 30 comprising one or
more functional accessory genes.

32. A system according to any of claims 23 to 30 devoid of any functional 
accessory genes. 

33. A viral system according to any one of claims 23 to 32 for use in a 
method of producing viral particles. 

34. A method for producing a viral particle which method comprises 
introducing into a producer cell: 

i) a viral genome as defined in any one of claims 24 to 33,

ii) one or more nucleotide sequences as defined in claim 20 and,

iii) nucleotide sequences encoding other essential viral packaging 
components not encoded by one or more of the nucleotide sequences of 
(ii). 

35. A viral particle produced by the production system of any one of claims 
24 to 33 or by the method of claim 34. 

36. A viral system according to any one of claims 23 to 33, or a viral particle 
according to claim 35, for treating a viral infection. 

37. A pharmaceutical composition comprising the viral system of any one of
claims 23 to 33, or the viral particle of claim 35, together with a pharmaceutically
acceptable carrier or diluent.

FIGURE 1

FIGURE 2

gagpol-HXB2 -> Codon Usage
DNA sequence 4308 b.p. ATGGGTGCGAGA ... GATGAGGATTAG linear
1436 codons
MW : 161929 Dalton CAI(S.c) : 0.083 CAI(E.c) : 0.151

FIGURE 3a

gagpol-SYNgp [1 to 4308] -> Codon Usage
DNA sequence 4308 b.p. ATGGGCGCCCGC ... GATGAGGATTAG linear
1436 codons
MW : 161929 Dalton CAI(S.c) : 0.080 CAI(E.c) : 0.296

Codon usage in human genes (MH), wild type HIV-1 Gag-pol
(WT) and the codon optimised HIV-1 Gag-pol (CO)

FIGURE 4

env-mn [1 to 2571] -> Codon Usage
DNA sequence 2571 b.p. ATGAGAGTGAAG ... GCTTTGCTATAA linear
857 codons
MW : 97078 Dalton CAI(S.c) : 0.083 CAI(E.c) : 0.140

FIGURE 5

SYNgp160mn -> Codon Usage
DNA sequence 2571 b.p. ATGAGGGTGAAG ... GCGCTGCTGTAA linear
857 codons
MW : 97078 Dalton CAI(S.c) : 0.074 CAI(E.c) : 0.419

FIGURE 6

HIV Constructs

ATGGGAGACCCTTTGACATGGAGCAAGGCGCTCAAGAAGTTAGAGAAGGTGACGGTACAA 60
ATGGGCGATCCCCTCACCTGGTCCAAAGCCCTGAAGAAACTGGAAAAAGTCACCGTTCAG 60



GGGTCTCAGAAATTAACTACTGGTAACTGTAATTGGGCGCTAAGTCTAGTAGACTTATTT 120 
GGTAGC CAAAAG C TTAC CACAGG CAATTG CAACTGGGCATTGTCCCTGG TGG ATCTTTTC 120 



CATGATACC^ACTTTGTAAAAGAAAAGGACTGGCAGCTGAGGGATGTCATTCCATTGCTG 180 
CACGACACTAATTTCGTTAAGGAGAAAGATTGGCAACTCAGAGACGTGATCCCCCTCTTG 180 



GAAGATGTAACTCAGACGCTGTCAGGACAAGAAAGAGAGGCCTTTGAAAGAACATGGTGG 24 0 
GAGGACGTH3ACCCAAACATTGTCTGGGCAGGAGCGCGAAGCTTTCGAGCGCACCTGGTGG 240 



GCAATTTCTGCTGTAAAGATGGGCCTCCAGATTAATAATGTAGTAGATGGAAAGGCATCA 300 
GCCATCAGCGCAGTCAAAATGGGGCTGCAAATCAACAACGTGGTTGACGGTAAAGCTAGC 300 



TTC CAGCTCCTAAGAGCGAAATATGAAAAGAAGACTG CTAATAAAAAGCAGTCTGAGCCC 360 
TTT CAACTG CT C CG C G CTAAG TACGAGAAG AAAAC CG C CAACAAGAAACAAT C CGAACCT 360 



TCTGAAGAATATCGAATCATGATAGATGGGGCTGGAAACAGA7\ATTTTAGACC^CTAACA 420 
AGCGAGGAGTACCCAATTATGATCGACGG CGCCGG CAATAGGAACTT CCGCC QACTGACT 420 



CCTAGAGGATATACTACTTGGGTCAATACCATACAGACZAAATGGTCTATTAAATGAAGCT 480 
CCCAGGGGCTATACCACCTGGGTCAACACCATCCAGACAAACGGACTTTTGAACGAAGCC 480 



AGTC^U^CTTATTTGGGATATTATCAGTAGACTGTACTTCTGAAGAAATGAATGCATTT 540 
TCCCAGAACCTGTTCGGCATCCTGTCTGTGGACTGCACCTCCGAAGAAATGAATGCTTTT 540 



TTGGATGTGGTACCTGGCCAGGCAGGACAAAAGCAGATATTACTTGATGCAATTGATAAG 600 
CTCGACGTGGTGCCAGGACAGGCTGGACAGAAA(^GATCCTGCTCGATGCCATTGACAAG 600 



ATAGCAGATGATTGGGATAATAGACATCCATTACCGAATGCTCCACTGGTGGCACCACCA 660 
ATCGC^ACGACTGGGATAATCGCCACCCCCTGCC^^CGCCCCTCrGGTGGCTCCCCCA 660 



CAAGGG C C TATTC CCATG ACAG CAAGG TTTATTAGAGG TTTAGG AGTAC CTAGAGAAAGA 72 0 
CAGGGGCCTATC CCTATGAC CG CTAGGTTCATTAGGGGACTGGGGGTGCCC CGCGAACGC 720 



CAGATGGAGCCTGCTTTTGATCAGTTTAGGCAGACATATAGACAATGGATAATAGAAGCC 780 
CAGATGGAGCCAG CAT TTGACCAATT TAGGCAGACCTACAGACAGTGGATCATCGAAGCC 780 



ATGTCAGAAGGCATC^AAGTGATGATTGGAAAACCTAAAGCTCAAAATATTAGGCAAGKSA 840 
ATGAGCGAGGGGATTAAAGTCATGATCGGAAAGCCCAAGGCACAGAACATCAGGCAGGGG 84 0 • 



G CTAAGGAACCTTAC CCAGAATTTGTAGACAGACTATTAT CCCAAATAAAAAGTGAGGGA 900 
GCCAAGGAACCATACCCTGAGTTTGTCGACAGGCTTCTGTCCCAGATTAAATCCGAAGGC 900 



CATCCACAAGAGATTTCAAAATTGTTGACTGATACACTGACTATTCAGAACGCZAAATGAG 960 
CACCCTC^GGAGATCTCCAAGTTCTTGACAGACACACTGACTATCCAAAATGGAAATGAA 960 
WT GAATGTAGAAATGCrrATGAGACATTTAAGACCAGAGGATACATTAGAAGAGAAAATGTAT 1020 

CO GAGTGCAGAAACG CCA TG AGG CACCTCAGACCIX1AAGATACCCTGGAG GAGAAAATGTAC " 1020 

WT G CTTG CAGAGACATTGG AACTACAAAACAAAAG ATGATGTTATTG GCAAAAG CACTTCAG 1080 

CO GCATGTCG^GACATTGG CACTACCAAGCAAAAGATGATGCfTG CTCGCCAAGGCTCTGCAA 1080 

WT ACTGGTCTTGCGGGCCCATTTAAAGGTGGA 1140 

CO ACCGGCCTGGCTGGTCCATTCAAAGGAGGAGCACTGAAGGGAGGTCCATTGAAAGCTGCA 1140 

WT CAAACATGTTATAACTG TG G G AAG C CAG G ACATTT AT CT AG T CAATGTAG AG CAC CTAAA 1200 

CO CAAACATG T TATAATTG TGG GAAG C CAGGACAT TTAT CTAG T CAATGTAGAGCAC CTAAA 1200 

WT GTCTG T TTTAAATG TAAACAGCCTGGACATTT CTCAAAG CAATG CAGAAGTGTT C CAAAA 1260 

CO GTCTGTTTTAAATGTAAAC^GCCTGGACATTTCTCAAAGC^TGCAGAAGTGTTCCAAAA 1260 

WT AACG GGAAG CAAG G G G C T CAAGG GAGGC C CCAG AAACAAAC TTT CCCGATACAACAG AAG 1320 

CO AACGG GAAGCAAG G GG CTCAAG GGAGGCCCCAGAAACAAACT TT C CCGATACAACAGAAG 13 20 

WT AGT CAGCACAACAAAT CTGTTGTACAAGAGACTCCT CAGACT CAAAAT CTGTACCCAGAT 13 80 

CO AGTCAG CACAACAAAT CTGTTGTACAAGAGACT CCT CAGACT CAAAAT CTGTACCCAGAT 13 BO 

WT CTGAGCGAAATAAAAAAGGAATACAATGTCAAGGAGAAGGATCA^ 1440 

CO CTGAGCGAAATAAAAAAGGAATAO^TGTCAAGGAGAAGGATa^TAGAGGATCTC^C 1440 

WT CTGGACAGTOTGTGGGAGTAACATATAATCTAGAGAAAAGGCCTACTACAATAGTATTAA 1500 

CO CTX3GACAGTTTCTGGGAGTAACATACAATCT 1500 

WT TTAATGATACTCCCTTAAATGTACTGTTAGACAC^GGAGCAGATACTTCAGT 1560 

CO TCAATGACAC C C CT CTTAATGTG CTG CTGGACACCGGAG CCGACAC CAGCGTTC T CAC TA 1560 

WT CTTGCACATrATAATAGGTTAAAATATAGAGGGAGAAAATATCAAGGGACGGGAATAATAG 1620 

CO CTG CT CAC TATAACAGACTGAAATACAGAGGAAG GAAATACCAGGG CACAGG CAT CATCG 1620 

WT GAGTGG G AGG AAATGTG GAAACATTTTCTACGCCTGTGACT 1680 

CO GCGTTGGAGGCAACGTCGAAACCTTTTCCACTCCTGTCACC^TCAAAAAGAAG 1680 

WT ACATTAAGAC3\AGAATCCTAGTGGCAGATATTC(^GTGACTATTTTGG 1740 

CO ACATTAAAACCAGAATGCTGGTCGCCGACATC^ 1740 

WT TTCAGGACTTAGGTGCAAAATTGGTTTTGG CACAGCT CTCCAAGGAAATAAAATTTAGAA 1800 

CO TCCAGKIACCTGGGCGCTAAACTOSTGCTGGCZACAACTG 1800 

WT AAATAGAGTTAAAAGAGGGCACAATGGGG CCAAAAATTC CTCAATGGCCACTCACTAAGG 1860 

CO AGATCGAGCTGAAAGAGGGCACAATGGGTCCAAAAATCCCCCAGTGGCCCCTGACCTU^ 1860 

WT AG AAAC TAG AAG G GG C CAAAG AG ATAG T CCAAAGACTAT TGTCAGAG GG AAAAATATCAG 1520 

CO AGAAGCTTG AGG G CG CTAAGG AAAT CG TG CAG CG C CTG C TTTCTGAG GG CAAGATTAG CG 1920 

WT AAOCTAGTGACMTAATCCTTATAATTCACCCATATTTGTAATAAAAAAGAGGTCTGGCA 1980 

CO AGG CCAG CGACAATAACCCTTACAACAGCCCCATCTTTGTGATTAAGAAAAGGAGCGGCA 1980 
WT AATGGAGGTTATTACAAGATCTGAGAGAATTAAACAAAACAGTACAAGTAGGAACGGAAA 2040 

CO AATG GAGAC TCCTG CAGGAC CTGAGGGAACT CAACAAGAC CG TC CAGGT CGGAA CTG AGA 2040 

WT TAT C CAGAGGATTG CC T CAC CCGG G AGGATTAATTAAATGTAAACACATGACTGTATTAG 2100 

CO TCTCTCGCGGACTGCCTCACCCCGGCGGCCTGATTAAATGCAAGCACATGACAGTCCTTG -2100 

WT ATATTGGAGATG<^TAraTCACTATACCCTTAGATCC^AGTTTAGACCATATACAGCTT 2160 

CO ACATTGGAGACGCTCATTTTACC^TCCCCCT'CX^TCCTGAATT^ 2 ISO 

WT TCAC TATT CCCTC CATTAAT CAT CAAGAACCAG AT AAAAGAT ATGTGTGGAAATGTT T AC 2220 

CO TTACCATCCCCAGCATCAATCACCAGGAGCCCGATAAACGCTATGTGTGGAAGTGCCTCC 2220 

WT CACAAGGATTCGTGTTGAGCCC^TATATATATC^GAAAACATTACAGGAAATTTTACAA^ 2280 

CO CCCAGGGATTTGTGCTTAGCCCCTACATTTACC^GAAGACACTTCAAGAGATCCTCCAAC 2280 

WT CTTTTAGGGAAAGATATCCTGAAGTACAATTGTATCAATATATGGATGATTTGTTCATGG 2340 

CO . CTTTCCGCGAAAGATACCCAGAGGTTCAACTCTACCAATATATGGACGACCTGTTCATGG 2340 

WT GAAGTAATG GT T CTAAAAAACAACACAAAGAGTTAATCATAGAAT TAAG GG CGATCT TAC 2400 

CO GG T C CAACG GGT CTAAG AAGCAGCACAAGGAACT CAT CAT CGAACTGAG GG CAAT C CT C C 2400 

WT TGGAAAAGGGTTTTGAGACACCAGATGATAAATTACAAG AAG TG C CAC CTTATAG CTGGC 2460 

CO TGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGCAAGAAGTT^ 2460 

WT TAGGTTATC7\ACOTTOTCCTGAAAAOTGGAAAGTA 2520 

CO TGGGCTAC CAG CTTTG C CCT GAAAACTGGAAAGTCCAG AAGATG CAGTTGGATATGGT CA 2520 

WT AGAATCCAACCCTTAATGATGTGCAAAAATTAATG G GGAATATAACATGGATG AG CTCAG 2580 

CO AGAACCCAACACTGAACXIACGTCC^GAAGCTCATGGGCAATATTACCTGGATGAGCTCCG 2580 

WT GGATCCCAGGGTTGACAGTAAAAC^CATTGCAGCTACTACTAAGGGATGTTTAGAGTTGA 2640 

CO G AAT CC CTGGGCT TAC CGTTAAG CA CATTG C CG CAACTACAAAAGGATG CCTGGAGTTGA 2640 

WT ATCAAAAAGTAATTTGGACGGAAGAGGCACAAAAAGAGTTAGAAGAAAATAATGAGAAGA 2700 

CO AC CAGAAGGT CATTTGGACAGAGGAAG CT CAG AAGGAACTGG AG GAGAATAATGAAAAGA 2700 

WT TTAAAAATGCTCAAGGGTTACAATATTATAATCCAGAAGAAGAAATGTTATGTGAGGTTG 2760 

CO TTAAGAATGCTCAAGGGCTCCAATACTACAATCCCGAAGAAGAAATGTTGTGCGAGGTCG 2760 

WT AAATTACAAAAAATTATGAGGCAACTTATGTTATAAAACAATCACAAGGAATCCTATGGG 2820 

CO AAATCACTAAGAACTACGAAGCCACCTATGTCATCAAACAGTCCCAAGGCATCTT^ 2820 

WT CAG GTAAAAAGATT AT G AAGG CT AAT AAG GG ATGG T CAACAG T AAAAAATT TAATG T TAT 2880, 

C° CCGGAAAGAAAATCATGAAGGCCAACAAAGGCTGGTCCACCX5TTAAAAATCTGATGCTCC 2880 

WT TGT TG CAACATGTGG CAACAGAAAG TATTACTAGAG TAG GAAAATG TCCAACG T TTAAGG 2940 

co TGCTCCAGCACGTOTCCACCGAGTCTATCACCCGCGTCGGCAAGTGCCCC^CCTTCAAAG 2940 

W T TAC<^TTTAC(^^GAGC^GTAATGTGGGAAATGa^AAAGGATGGTATTATTCTTGGC 3 000 

co TTCCCTTCACTAAGGAGCAGGTGATGTGGGAGATGCAAAAAGGCTGGTACTACTCTTGGC 3000 
WT TCC CAGAAATAGTATATACACATCAAGTAG TTCATGATGATTGGAGAATGAAATTGGTAG 3060 

CO TTCCCGAGAT CGTCTACAGCCACCAAGTGGTGCACGACX3ACTGGAGAATGAAG CTTGTCX5 * 3060 

WT AAGAAC C TACAT C7VGGAATAACAATAT ACACTG ATGGGGG AAAACAAAATGGAG AAGG AA 3120 

CO AGGAGCCCACTAGCGGAATTACAATCTATACCGACGGCGGAAAGCAAAACGGAGAGGGAA 3120 

WT TAGCAGCTTATGTGACCAGTAATGGGAGAACTAAACAGAAAAGGTTAGGACCTGTCACTC 3180 

CO TCGCTGCATACGTCACATCTAAOSGCCGCACCAAGCT^AAAGAGGCTCGGCCCTGTCACTC 3180 

WT AT CAAGTTG CT GAAAGAATG G CAATACAAATGG CATTAG AGGATAC CAGAGATAAACAAG 3240 

CO ACCAGGTGGCTGAGAGGATGGCTATCCAGATGGCCCTTGAGGACACTAGAGACAAGCAGG 3240 

WT TAAATATAGTAACTGATAGTTATTATTGTTGGAT^AAATATTACAGAAGGATTAGGTTTAG 3300 

CO TGAACATTGTGACTGAC^GCTACTACTGCTGGAAAAACATCACAGAGGGCCTTGGCCTGG 3300 

WT AAGGACCAC^U^GTCCTTGGTGGCOTATAATAOU^TATACGAGAAAAAGAGATAGTTT 3360 

CO AGGGACCC CAGTCTCCCTGGTGGC CTATCATCCAGAATATCCG CGAAAAGGAAATTGTCT 3360 

WT ATTTTGCTTGGGTACCTGGTC^CAAAGGGATATATGGTAATC^TTGGCAGATGAAGCCG 3420 

CO ATTTCGCCTGGGTGCCTCGACACAAAGGAATTTACGGCAACCAACTCGCCGATGAAGCCG 3420 

WT CAAAAATAAAAGAAGAAAT CATG CTAGCATACCAAG G CACACAAAT TAAAGAG AAAAG AG 3480 

CO CCAAAATTAAAGAGGAAATCATGCTTGCCTAC^GGGCACACAGATTAAGGAGAAGAGA^ 3480 

WT ATGAAGATGCAGGGTTTGACTTA*TGTGTTCCTTATGACATCATGATACCTGTATCTGACA 3540 • 

CO ACQ AGGACGCTGG CTTTGAC CTGTGTGT G C CATACGACATCATGATTCC CGTTAGCGACA 3540 

WT CAAAAAT CATACC CACAGATGTAAAAATTCAAGTTC CT CCTAATAGCTTTGGATGGGTCA 3600 

CO CAAAGATCAT T C CAAC CGATGT CAAGAT C CAG GTGC CACC CAATT CATTTGGTTGGGTGA 3600 

WT CTGGGAAATCATCAATGGCAAAACAGGGGTTATTAATTAATGGAGGAATT^ATTGATGAAG 3660 

CO CCGGAAAGTCCAGC^TGGCTAAGCAGGGTCTTCTGATTAACGGGGGAATCATTGATGAAG 3660 

WT ' G AT AT ACAG G AG AAATACAAGTG AT ATGTACTAATATTGGAAAAAGTAAT ATTAAATTAA 3720 

CO GATACAC CGGCGAAATCC^GGTGATCTGCACAAATATCGGCAAAAG C^LATATTA 3720 

WT TAGAGGGACAAAAATTTGCACAATTJ^TTATACTACAGCATCACTCAAATT^ 3780 

CO T CGAAGGGC^USAAGTTOGCTCAACTCATCATCCTCCAG CACCACAG CAATTCAAGJACAAC 3 780 

WT CTTGGGATGAAAATAAAATATCTCAGAGAGGGGATAAAGGATTTGGAAGTACAGGAGTAT 3 840 

CO CTTGGGACGAAAACAAGATTAG CCAGAGAGGTGACAAGGGCTTCGG CAG CACAGGTGTGT 3840 

WT TCTGGGTAGAAAATATTC^GGAAGac^GATGAACATGAGAATTGGCATACATC^Ca^A 3900 

CO T CTGGG TGGAG AACAT CCAGGAAG CA CAG GACGAG CACGAG AAT TGG CACAC CT C CCCTA 3900 

WT AGATATTGG CAAGAAATTATAAGATACCATTGACTGTAG CAAAACAGATAACTCAAGAAT 3 960 

CO AGATTTTGG CCCG CAATT ACAAGATCCCACTGACTGTGG CTAAGCAGATCACACAGGAAT 3960 

WT GTCCTCATTG CACTAAGCAAGGATCAGGACCTG CAGGTTGTGTCATG AG ATCTCCTAATC 4020 

CO GCCCCCACTG CACCAAACAAGGTTCTGGCCCCG CCGGCTG CGTGATGAGGTCCCCCAAT C 4020 
WT ATTGGCAG G CAGATTGCACACATTTGGACAATAAGATAATATTGACTTTT GTAGAGT CAA 4080 

CO ACTGGCAGGCAGATTGCACCCACCT CGACAACAAAATTATCCTGACCTTCGTGGAGAGCA ' 4080 

WT ATTCAGGATACATACATG CTACATTATTGTCAAAAGAAAAT G CATTATGTACTTCATTGG 4140 

CO ATTCCGGCTACATCCACGCAACACTCCTCTCCAAGGAAAATGC^ 4140 

WT CTATTTTAGAATGGG CAAGAT TGTTTT CAC CAAAGTCCTT ACACACAGATAACG G CACTA 4200 

CO CAATTCTGGAATGGGCCAGGCTGTTCTCTCCAAAATCCCTGC^ 4200 

WT ATTTT GT GG CAGAAC CAGTTGTAAAT T TGTTG AAGT TC C TAAAGATAG CACATAC CACAG 4260 

CO ACTTTGTGGCTGAACCTGTGGTGAATC TGCTGAAGTTCCTGAAAATCGCCCACAC CACTG 4260 

WT GAATACCATATC^TCCAGAAAGTCAGGGTATTGTAGAAAGGGCAAATAGGACCTTGAAAG 4320 

CO GCATTCCCTATCACCCTCAAAGCCAGGGCATTGTC 4320 

WT AGAAGATTCAAAGTC^TAGAGACAACACTCAAAC^CTGGAGGCAGCTT^ 4380 

CO AAAAGATCCAATCTCACAGAGACyVATACACA^ 4380 

WT TCATTACTTGTAACAAAGGGAGGGAAAGTATGGGAGGACAGACACCATGGGAAGTATTTA 4440 

CO TTAT CAC CTG CAACAAAGGAAGAGAAAG CATGGG CGG CCAGACC CCCTGGGAGGT CTT CA 4440 

WT TCACT AATCAAGCACAAGTAATACATGAGAAACT TTTAC TACAG CAAG CACAA'T C CTCCA 4500 

CO TCACTAACCAGGCCCAGGTCATCCATGAAAAGCTGCTCTTGC^ 4500 

WT AAAAATTT TGTTTT TACAAAATCC CTGGTGAACATGATTG G AAGGGACCTACTAGGGTGC 4560 ■ 

CO AAAAGTTCTGCTTTTATAAGATCCCCGGTGAGCACGACT'GGAAAGGTCCTACAAGAGTTT 4560 

WT TGTG GAAGGG TGATGG TG CAGTAGTAGTTAATGATG AAGGAAAGGGAATAATTG CTGT AC 4620 

CO TGTGGAAAGGAGACGGCGCAGTTGTGGTG^03ATGAGGGCAAGGGGATCATC 4620 

WT CAT TAACCAGGACT AAGT TACTAATAAAAC CAAATTGA 4658 

CO CCCTGACACGCACCAAGCTTCTCATCAAGCCAAACTGA 4658 
SEQUENCE LISTING PART OF THE DESCRIPTION 



SEQ. ID. NO. 1 - Wild type gagpol sequence for strain HXB2 (accession no. K03455) 

ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGAGAATTAG ATCGATGGGA AAAAATTCGG 60 
TTAAGGCCAG GGGGAAAGAA AAAATATAAA TTAAAACATA TAGTATGGGC AAGCAGGGAG 120 
CTAGAACGAT TCGCAGTTAA TCCTGGCCTG TTAGAAACAT CAGAAGGCTG TAGACAAATA 180 
CTGGGACAGC TACAACCATC CCTTCAGACA GGATCAGAAG AACTTAGATC ATTATATAAT 240 
ACAGTAGCAA CCCTCTATTG TGTGCATCAA AGGATAGAGA TAAAAGACAC CAAGGAAGCT 300 
TTAGACAAGA TAGAGGAAGA GCAAAACAAA AGTAAGAAAA AAGCACAGCA AGCAGCAGCT 360 
GACACAGGAC ACAGCAATCA GGTCAGCCAA AATTACCCTA TAGTGCAGAA CATCCAGGGG 420 
CAAATGGTAC ATCAGGCCAT ATCACCTAGA ACTTTAAATG CATGGGTAAA AGTAGTAGAA 480 
GAGAAGGCTT TCAGCCCAGA AGTGATACCC ATGTTTTCAG CATTATCAGA AGGAGCCACC 540 
CCACAAGATT TAAACACCAT GCTAAACACA GTGGGGGGAC ATCAAGCAGC CATGCAAATG 600 
TTAAAAGAGA CCATCAATGA GGAAGCTGCA GAATGGGATA GAGTGCATCC AGTGCATGCA 660 
GGGCCTATTG CACCAGGCCA GATGAGAGAA CCAAGGGGAA GTGACATAGC AGGAACTACT 720 
AGTACCCTTC AGGAACAAAT AGGATGGATG ACAAATAATC CACCTATCCC AGTAGGAGAA 780 
ATTTATAAAA GATGGATAAT CCTGGGATTA AATAAAATAG TAAGAATGTA TAGCCCTACC 840 
AGCATTCTGG ACATAAGACA AGGACCAAAG GAACCCTTTA GAGACTATGT AGACCGGTTC 900 
TATAAAACTC TAAGAGCCGA GCAAGCTTCA CAGGAGGTAA AAAATTGGAT GACAGAAACC 960 
TTGTTGGTCC AAAATGCGAA CCCAGATTGT AAGACTATTT TAAAAGCATT GGGACCAGCG 1020 
GCTACACTAG AAGAAATGAT GACAGCATGT CAGGGAGTAG GAGGACCCGG CCATAAGGCA 1080 
AGAGTTTTGG CTGAAGCAAT GAGCCAAGTA ACAAATTCAG CTACCATAAT GATGCAGAGA 1140 
GGCAATTTTA GGAACCAAAG AAAGATTGTT AAGTGTTTCA ATTGTGGCAA AGAAGGGCAC 1200 
ACAGCCAGAA ATTGCAGGGC CCCTAGGAAA AAGGGCTGTT GGAAATGTGG AAAGGAAGGA 1260 
CACCAAATGA AAGATTGTAC TGAGAGACAG GCTAATTTTT TAGGGAAGAT CTGGCCTTCC 1320 
TACAAGGGAA GGCCAGGGAA TTTTCTTCAG AGCAGACCAG AGCCAACAGC CCCACCAGAA 1380 
GAGAGCTTCA GGTCTGGGGT AGAGACAACA ACTCCCCCTC AGAAGCAGGA GCCGATAGAC 1440 
AAGGAACTGT ATCCTTTAAC TTCCCTCAGG TCACTCTTTG GCAACGACCC CTCGTCACAA 1500 
TAAAGATAGG GGGGCAACTA AAGGAAGCTC TATTAGATAC AGGAGCAGAT GATACAGTAT 1560 
TAGAAGAAAT GAGTTTGCCA GGAAGATGGA AACCAAAAAT GATAGGGGGA ATTGGAGGTT 1620 
TTATCAAAGT AAGACAGTAT GATCAGATAC TCATAGAAAT CTGTGGACAT AAAGCTATAG 1680 
GTACAGTATT AGTAGGACCT ACACCTGTCA ACATAATTGG AAGAAATCTG TTGACTCAGA 1740 
TTGGTTGCAC TTTAAATTTT CCCATTAGCC CTATTGAGAC TGTACCAGTA AAATTAAAGC 1800 
CAGGAATGGA TGGCCCAAAA GTTAAACAAT GGCCATTGAC AGAAGAAAAA ATAAAAGCAT 1860 
TAGTAGAAAT TTGTACAGAG ATGGAAAAGG AAGGGAAAAT TTCAAAAATT GGGCCTGAAA 1920 
ATCCATACAA TACTCCAGTA TTTGCCATAA AGAAAAAAGA CAGTACTAAA TGGAGAAAAT 1980 
TAGTAGATTT CAGAGAACTT AATAAGAGAA CTCAAGACTT CTGGGAAGTT CAATTAGGAA 2040 
TACCACATCC CGCAGGGTTA AAAAAGAAAA AATCAGTAAC AGTACTGGAT GTGGGTGATG 2100 
CATATTTTTC AGTTCCCTTA GATGAAGACT TCAGGAAGTA TACTGCATTT ACCATACCTA 2160 
GTATAAACAA TGAGACACCA GGGATTAGAT ATCAGTACAA TGTGCTTCCA CAGGGATGGA 2220 
AAGGATCACC AGCAATATTC CAAAGTAGCA TGACAAAAAT CTTAGAGCCT TTTAGAAAAC 2280 
AAAATCCAGA CATAGTTATC TATCAATACA TGGATGATTT GTATGTAGGA TCTGACTTAG 2340 
AAATAGGGCA GCATAGAACA AAAATAGAGG AGCTGAGACA ACATCTGTTG AGGTGGGGAC 2400 
TTACCACACC AGACAAAAAA CATCAGAAAG AACCTCCATT CCTTTGGATG GGTTATGAAC 2460 
TCCATCCTGA TAAATGGACA GTACAGCCTA TAGTGCTGCC AGAAAAAGAC AGCTGGACTG 2520 
TCAATGACAT ACAGAAGTTA GTGGGGAAAT TGAATTGGGC AAGTCAGATT TACCCAGGGA 2580 






WO 01/79518 



PCT/GB01/01784 



TTAAAGTAAG GCAATTATGT AAACTCCTTA GAGGAACCAA AGCACTAACA GAAGTAATAC 2640 
CACTAACAGA AGAAGCAGAG CTAGAACTGG CAGAAAACAG AGAGATTCTA AAAGAACCAG 2700 
TACATGGAGT GTATTATGAC CCATCAAAAG ACTTAATAGC AGAAATACAG AAGCAGGGGC 2760 
AAGGCCAATG GACATATCAA ATTTATCAAG AGCCATTTAA AAATCTGAAA ACAGGAAAAT 2820 
ATGCAAGAAT GAGGGGTGCC CACACTAATG ATGTAAAACA ATTAACAGAG GCAGTGCAAA 2880 
AAATAACCAC AGAAAGCATA GTAATATGGG GAAAGACTCC TAAATTTAAA CTGCCCATAC 2940 
AAAAGGAAAC ATGGGAAACA TGGTGGACAG AGTATTGGCA AGCCACCTGG ATTCCTGAGT 3000 
GGGAGTTTGT TAATACCCCT CCCTTAGTGA AATTATGGTA CCAGTTAGAG AAAGAACCCA 3060 
TAGTAGGAGC AGAAACCTTC TATGTAGATG GGGCAGCTAA CAGGGAGACT AAATTAGGAA 3120 
AAGCAGGATA TGTTACTAAT AGAGGAAGAC AAAAAGTTGT CACCCTAACT GACACAACAA 3180 
ATCAGAAGAC TGAGTTACAA GCAATTTATC TAGCTTTGCA GGATTCGGGA TTAGAAGTAA 3240 
ACATAGTAAC AGACTCACAA TATGCATTAG GAATCATTCA AGCACAACCA GATCAAAGTG 3300 
AATCAGAGTT AGTCAATCAA ATAATAGAGC AGTTAATAAA AAAGGAAAAG GTCTATCTGG 3360 
CATGGGTACC AGCACACAAA GGAATTGGAG GAAATGAACA AGTAGATAAA TTAGTCAGTG 3420 
CTGGAATCAG GAAAGTACTA TTTTTAGATG GAATAGATAA GGCCCAAGAT GAACATGAGA 3480 
AATATCACAG TAATTGGAGA GCAATGGCTA GTGATTTTAA CCTGCCACCT GTAGTAGCAA 3540 
AAGAAATAGT AGCCAGCTGT GATAAATGTC AGCTAAAAGG AGAAGCCATG CATGGACAAG 3600 
TAGACTGTAG TGCAGGAATA TGGCAACTAG ATTGTACACA TTTAGAAGGA AAAGTTATCC 3660 
TGGTAGCAGT TCATGTAGCC AGTGGATATA TAGAAGCAGA AGTTATTCCA GCAGAAACAG 3720 
GGCAGGAAAC AGCATATTTT CTTTTAAAAT TAGCAGGAAG ATGGCCAGTA AAAACAATAC 3780 
ATACTGACAA TGGCAGCAAT TTCACCGGTG CTACGGTTAG GGCCGCCTGT TGGTGGGCGG 3840 
GAATCAAGCA GGAATTTGGA ATTCCCTACA ATCCCCAAAG TCAAGGAGTA GTAGAATCTA 3900 
TGAATAAAGA ATTAAAGAAA ATTATAGGAC AGGTAAGAGA TCAGGCTGAA CATCTTAAGA 3960 
CAGCAGTACA AATGGCAGTA TTCATCCACA ATTTTAAAAG AAAAGGGGGG ATTGGGGGGT 4020 
ACAGTGCAGG GGAAAGAATA GTAGACATAA TAGCAACAGA CATACAAACT AAAGAATTAC 4080 
AAAAACAAAT TACAAAAATT CAAAATTTTC GGGTTTATTA CAGGGACAGC AGAAATTCAC 4140 
TTTGGAAAGG ACCAGCAAAG CTCCTCTGGA AAGGTGAAGG GGCAGTAGTA ATACAAGATA 4200 
ATAGTGACAT AAAAGTAGTG CCAAGAAGAA AAGCAAAGAT CATTAGGGAT TATGGAAAAC 4260 
AGATGGCAGG TGATGATTGT GTGGCAAGTA GACAGGATGA GGATTAG 4307 
SEQ. ID. NO. 2 - pSYNGP 

TCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTA 

TTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGCTCATGTCC 

AATATGACCGCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGG 

GTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCC 

GCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCAT 

AGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGC 

CCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGA 

CGGTAAATGGCCCGCCTGGCATTATGCCC^GTACATGACCTTACGGGACTTTCCTACTTG 

GCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACAC 

CAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGT 

CAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTG 

CGATCGCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATA 

AGCAGAGCTCGTTTAGTGAACCGTCAGATC71CTAGAAGCTTTATTGCGGTAGTTTATCAC 

AGTTAAATTGCTAACGCAGTCAGTGCTTCTGACACAACAGTCTCGAACTTAAGCTGCAGT 

GACTCTCTTAAGGTAGCCTTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGTATCAA 

GGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACT 

CTTGCGTTTCTGATAGGC^CCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCAC 

AGGTGT CCACTC CCAGTTCAATTACAG CTCTTAAGGCTAGAGTACTTAATACGACTCACT 

ATAGGCTAGCCTCGAGAATTCGCCACCATGGGCGCCCGCGCCAGCGTGCTGTCGGGCGGC 

GAGCTGGACCGCTGGGAGAAGATCCGCCTOCGCCCCGGCGGCAAAAAGAAGTACAAGCTG 

AAGCACATCGTGTGGGCCAGCCGCGAACTGGAGCGCTTCGCCGTGAACCCCGGGCTCCTG 

GAGACCAGCGAGGGGTGCCGCCAGATCCTCGGCCAACTGCAGCCCAGCCTGCAAACCQGC 

AGCGAGGAGCTGCGCAGCGTGTACAACACCGTGGCCACGCTGTACTGCGTCCACCAGCGC- : 

ATCGAAATCAAGGATACGAAAGAGGCCCTGGATAAAATCGAAGAGGAACAGAATAAG^GC 

AAAAAGAAGGCCCAACAGGCCGCCGCGGACACCGGACACAGCAACCAGGTCAGCCAGAAC 

TACCCCATCGTGCAGAACATCCAGGGGCAGATGGTGCACCAGGCCATCTCCCCCCGCACG 

CTGAACGCCTGGGTGAAGGTGGTGGAAGAGAAGGCTTTTAGCCCGGAGGTGATACCCATG . 

TTCTCAGCCCTGTCAGAGGGAGCCAC CC CCCAAGATCTGAACAC CATGCTCAACACAGTG 

GGGGGACACCAGGCCGCCATGCAGATGCTGAAGGAGACCATCAATGAGGAGGCTGCCGAA 

TGGGATCGTGTGCATCCGGTGCACGCAGGGCCCATCGCACCGGGCCAGATGCGTGAGCCA 

CGGGGCTCAGACATCGCCGGAACGACTAGTACCCTTCAGGAACAGATCGGCTGGATGACC 

AACAACCCACCCATCCCGGTGGGAGAAATCTACAAACGCTGGATCATCCTGGGCCTGAAC 

AAGATCGTGCG CATGTATAGCC CTACCAG CATCCTGGACATC CG CCAAGGCCCGAAGGAA 

CCCTTTCGCGACTACGTGGACCGGTTCTACAAAACGCTCCGCGCCGAGCAGGCTAGCCAG 

GAGGTGAAGAACTGGATGACCGAAACCCTGCTGGTCCAGAACGCGAACCCGGACTGCAAG 

ACGATCCTGAAGG C CCTGGG CCC AG CGG C TACCCTAGAGGAAATGATGACCGC CTGTCAG 

GGAGTGGGCGGACCCGGCCACAAGGCACGCGTCCTGGCTGAGGCCATGAGCCAGGTGACC 

AACTCCGCTACCATCATGATGCAGCGCGGCAACTTTCGGAACCAACGCAAGATCGTCAAG 

TGCTT CAACTGTGGC AAAGAAGGGCACACAGCC CG CAACTGCAGGGCCCCTAGGAAAAAG 

GGCTGTTGGAAATGTGGAAAGGAAGGACACCAAATGAAAGATTGTACTGAGAGACAGGCT 

AATTTTTTAGGGAAGATCTGGCCTTCCCACAAGGGAAGGCCAGGGAATTTTCTTCAGAGC 

AGACCAGAGCCAACAGCCCCACCAGAAGAGAGCTTCAGGTTTGGGGAAGAGACAACAACT 

CCCTCTCAGAAGCAGGAGCCGATAGACAAGGAACTGTATCCTTTAGCTTCCCTCAGATCA 

CTCTTTGGCAGCGACCCCTCGTCACAATAAAGATAGGGGGGCAGCTCAAGGAGGCTCTCC 

TGGACACCGGAGCAGACGACACCGTGCTGGAGGAGATGTCGTTGCCAGGCCGCTGGAAGC 

CGAAGATGATCGGGGGAATCGGCGGTTTCATCAAGGTGCGCCAGTATGACCAGATCCTCA 

TCGAAATCTGCGGCCACAAGGCTATCGGTACCGTGCTGGTGGGCCCCACACCCGTCAACA 

TCATCGGACGCAACCTGTTGACGCAGATCGGTTGCACGCTGAACTTCCCCATTAGCCCTA 

TCGAGACGGTACCGGTGAAGCTGAAGCCCGGGATGGACGGCCCGAAGGTCAAGCAATGGC 

CATTGACAGAGGAGAAGATCAAGGCACTGGTGGAGATTTGCACAGAGATGGAAAAGGAAG 

GGAAAATCTCCAAGATTGGGCCTGAGAACCCGTACAACACGCCGGTGTTCGCAATCAAGA 

AGAAGGACTCGACGAAATGG CGC AAG CTGGTGGACTTC CGCGAGC TGAACAAGCGCACGC 

AAGACTTCTGGGAGGTTCAGCTGGGCATCCCGCACCGCGCAGGGCTGAAGAAGAAGAAAT 

CCGTGACCGTACTGGATGTGGGTGATGCCTACTTCTCCGTTCCCCTGGACGAAGACTTCA 

GGAAGTACACTGCCTTCACAATCCCTTCGATCAACAACGAGACACCGGGGATTCGATATC 

AGTACAACGTGCTGCCCCAGGGCTGGAAAGGCTCTCCCGCAATCTTCCAGAGTAGCATGA 

CCAAAATCCTGGAGCCTTTCCGCAAACAGAACCCCGACATCGTCATCTATCAGTACATGG 

ATGACTTGTACGTGGGCTCTGATCTAGAGATAGGGCAGCAGCGCACCAAGATCGAGGAGC 

TGCGCCAGCACCTGTTGAGGTGGGGACTGACCACACCCGACAAGAAGCACCAGAAGGAGC 
CTCCCTTCCTCTGGATGGGTTACGAGCTGCACCCTGACAAATGGAGCGTGCAGCCTATCG 

TGCTGCCAGAGAAAGACAGCTGGACTGTCAACGACATACAGAAGCTGGTGGGGAAGTTGA 

ACTGGGCCAGTCAGATTTACCCAGGGATTAAGGTGAGGCAGCTGTGCAAACTCCTCCGCG 

GAACCAAGGCACTCACAGAGGTGATCCCCCTAACCGAGGAGGCCGAGCTCGAACTGGCAG 

AAAACCGAGAGATCCTAAAGGAGCCCGTGCACGGCGTGTACTATGACCCCTCCAAGGACC 

TGATCGCCGAGATCCAGAAGCAGGGGCAAGGCCAGTGGACCTATCAGATTTACCAGGAGC 

CCTTCAAGAACCTGAAGACCGGCAAGTACGCCCGGATGAGGGGTGCCCACACTAACGACG 

TCAAGCAGCTGACCGAGGCCGTGCAGAAGATCACCACCGAAAGCATCGTGATCTGGGGAA 

AGAGTCCTAAGTTCAAGCTGCCGATCCAGAAGGAAACCTGGGAAACCTGGTGGACAGAGT 

ATTGGCAGGCCACCTGGATTCCTGAGTGGGAGTTCGTCAACACCCCTCCCCTGGTGAAGC 

TGTGGTACCAGCTGGAGAAGGAGCCCATAGTGGGCGCCGAAACCTTCTACGTGGATGGGG 

C CGCTAACAGGGAGACTAAGCTGG GCAAAGC CGGATACGTCAC TAACCGGGG CAGACAGA 

AGGTTGTCACCCTCACTGACACCACCAACCAGAAGACTGAGCTGCAGGCCATTTACCTCG 

CTTTGCAGGACTCGGGCCTGGAGGTGAACATCGTGACAGACTCTCAGTATGCGCTGGGCA 

TCATTCAAGCCCAGCCAGACCAGAGTGAGTCCGAGCTGGTCAATCAGATCATCGAGCAGC 

TGATCAAGAAGGAAAAGGTCTATCTGGCCTGGGTACCCGCCCACAAAGGCATTGGCGGCA 

ATGAGCAGGTCGACAAGCTGGTCTCGGCTGGCATCAGGAAGGTGCTATTCCTGGATGGCA 

TCGACAAGGCCCAGGACGAGCACGAGAAATACCACAGCAACTGGCGGGCCATGGCTAGCG 

ACTTCAACCTGCCCCCTGTGGTGGCCAAAGAGATCGTGGCCAGCTGTGACAAGTGTCAGC 

TCAAGGGCGAAGCCATGCATGGCCAGGTGGACTGTAGCCCCGGCATCTGGCAACTCGATT 

GCACCCATCTGGAGGGCAAGGTTATCCTGGTAGCCGTCCATGTGGCCAGTGGCTACATCG 

AGGCCGAGGTCATTCCCGCCGAAACAGGGCAGGAGACAGCCTACTTCCTCCTGAAGCTGG 

CAGG C CGGTGG C CAGTGAAGACCATCCATACTGACAATGG CAGCAATTT CACCAGTGCTA 

CGGTTAAGGCCGCCTGCTGGTGGGCGGGAATCAAGCAGGAGTTCGGGATCCCCTACAATC 

CCCAGAGTCAGGGCGTCGTCGAGTCTATGAATAAGGAGTTAAAGAAGATTATCGGCCXGG 

TCAGAGATCAGGCTGAGCATCTCAAGACCGCGGTCCAAATGGCGGTATTCATCCACAATT 

TCAAGCGGAAGGGGGGGATTGGGGGGTACAGTGCGGGGGAGCGGATCGTGGACATCATCG 

CGACCGACATCCAGACTAAGGAGCTGCAAAAGCAGATTACCAAGATTCAGAATTTCCGGG 

TCTACTACAGGGAGAGCAGAAATCCCCTCTGGAAAGGCCCAGCGAAGCTCCTCTGGAAGG 

GTGAGGGGGCAGTAGTGATCCAGGATAATAGCGACATCAAGGTGGTGCCCAGAAGAAAGG 

CGAAGATCATTAGGGATTATGGCAAACAGATGGCGGGTGATGATTGCGTGGCGAGCAGAC 

AGGATGAGGATTAGGAATTGGGCTAGAGCGGCCGCTTCCCTTTAGTGAGGGTTAATGCTT 

CGAGCAGACATGATAAGATACATTGATGAGTTTGGACAAACCACAACTAGAATGCAGTGA 

AAAAAATGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGC 

TGCAATAAAC^^GTTAACAAGAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAG 

ATGTGGGAGGTTTTTTAAAGCAAGTAAAACCTCTACAAATGTGGTAAAATCCGATAAGGA 

TCGATCCGGGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGC 

GCAGCCTGAATGGCGAATGGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGT 

GGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTT 

CTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCT 

CCCTTTAGGGTTGCGATTTAGAGCTTTACGGCACCTCGAGCGCAAAAAACTTGATTTGGG 

TGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGA 

GTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTC 

GGTCTATT CTTTTGATTTATAAG GGATTTTGC CGAT TTCGGC CTATTGGTTAAAAAATGA 

GCTGATTTAACAAATATTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTTCGCC 

TGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATACGCGGATCTG 

CGCAGCACCATGGCCTGAAATAACCTCTGAAAGAGGAACTTGGTTAGGTACCTTCTGAGG 

CGGAAAGAACCAGCTGTGGAATGTGTGTCAGTTAGGGTGTGGAAAGTCGCCAGGCTCCCC 

AGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAGTC 

CCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCAT 

AGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCGCATTCTCC 

GCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGA 

GCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTTGA 

TTCTTCTGACACAACAGTCTCGAAC TTAAGG CTAGAG CCACCATGATTG AACAAGATGGA 

TTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAA 

CAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTT 

CTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGG 

CTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAA 

GCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCAC 

CTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGATGCAATGCGGCGGCTGCATACGCTT 

GATCCGGCTACCTGCCCATTCGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACT 
CGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCG 

CCAGCCGAACTGTTCGCCAGGCTCAAGGCGCGCATGCCCGACGGCGAGGATCTCGTCGTG 

ACCCATGGCGATGCCTGCTTGCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTC 

ATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGT 

GATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATC 

GCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCTTGACGAGTTCTTCTGAGCG 

GGACTCTGGGGTTCGAAATGACCGACCAAGCGACGCCCAACCTGCCATCACGATGGCCGC 

AATAAAATATCTTTATTTTCATTACATCTGTGTGTTGGTTTTTTGTGTGAATCGATAGCG 

ATAAGGATCCGCGTATGGTGCAGTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGC 

CAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCA 

TCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCG 

TCATCACCGAAACGCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAAT 

GTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGA 

ACCC CTATTTGTTTAT TTTTCTAAATACATTCAAATATGTATC CG CTCATGAGACAATAA 

CCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGT 

GTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACG 

CTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTG 

GATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATG 

AGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAG 

CAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTGACA 

GAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGGCATAACCATG 

AGTGATAAGACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACC 

GCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTG 

AATGAAG C CATAC CAAACGACGAGCGTGAC AC CACGATGCCTGTAGCAATGGCAACAACG 

TTGCGCAAACTATTAACTGGCGAAC TACTT ACTCTAGCTTC CCGGCAACAATTAATAGAC 

TGGATGGAGGCGGATAAAGTTGCAGGAC C ACTTCTG CGCTCGGCCCTTCCGG CTGG CTGG 

TTTATTGCTGATAAATCTGGAGC CGGTGAG CGTGGGTCTC GCGGTATCATTGCAGCACTG 

GGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACT 

ATGGATGAACGAAATAGACAGATGGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAA 

CTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTT 

AAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAG ' 

TTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCT 

TTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTT 

TGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCG 

CAG ATAC CAAATACTGTCCTTCTAGTGTAGCC GTAGTTAGGC CACC AC TTGAAGAACTCT 

GTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGC 

GATAAGT CGTGTCTTAC CGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGC AGCGG 

TCGGGCTGAACGGGGGGTTCGTGCACACAG CC CAGCTTGGAG CGAACGACCTACAC CGAA 

CTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCG 

GACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGG 

GGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGA 

TTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTT 

TTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGGCTCGACAGATCT 
SEQ. ID. NO. 3 - Envelope Gene from HIV-1 MN (Genbank accession no. M17449) 

ATGAGAGTGA AGGGGATCAG GAGGAATTAT CAGCACTGGT GGGGATGGGG CACGATGCTC 60 
CTTGGGTTAT TAATGATCTG TAGTGCTACA GAAAAATTGT GGGTCACAGT CTATTATGGG 120 
GTACCTGTGT GGAAAGAAGC AACCACCACT CTATTTTGTG CATCAGATGC TAAAGCATAT 180 
GATACAGAGG TACATAATGT TTGGGCCACA CAAGCCTGTG TACCCACAGA CCCCAACCCA 240 
CAAGAAGTAG AATTGGTAAA TGTGACAGAA AATTTTAACA TGTGGAAAAA TAACATGGTA 300 
GAACAGATGC ATGAGGATAT AATCAGTTTA TGGGATCAAA GCCTAAAGCC ATGTGTAAAA 360 
TTAACCCCAC TCTGTGTTAC TTTAAATTGC ACTGATTTGA GGAATACTAC TAATACCAAT 420 
AATAGTACTG CTAATAACAA TAGTAATAGC GAGGGAACAA TAAAGGGAGG AGAAATGAAA 480 
AACTGCTCTT TCAATATCAC CACAAGCATA AGAGATAAGA TGCAGAAAGA ATATGCACTT 540 
CTTTATAAAC TTGATATAGT ATCAATAGAT AATGATAGTA CCAGCTATAG GTTGATAAGT 600 
TGTAATACCT CAGTCATTAC ACAAGCTTGT CCAAAGATAT CCTTTGAGCC AATTCCCATA 660 
CACTATTGTG CCCCGGCTGG TTTTGCGATT CTAAAATGTA ACGATAAAAA GTTCAGTGGA 720 
AAAGGATCAT GTAAAAATGT CAGCACAGTA CAATGTACAC ATGGAATTAG GCCAGTAGTA 780 
TCAACTCAAC TGCTGTTAAA TGGCAGTCTA GCAGAAGAAG AGGTAGTAAT TAGATCTGAG 840 
AATTTCACTG ATAATGCTAA AACCATCATA GTACATCTGA ATGAATCTGT ACAAATTAAT 900 
TGTACAAGAC CCAACTACAA JAAAAGAAAA AGGATACATA TAGGACCAGG GAGAGCATTT 960 
TATACAACAA AAAATATAAT AGGAACTATA AGACAAGGAC ATTGTAACAT TAGTAGAGCA 1020 
AAATGGAATG ACACTTTAAG ACAGATAGTT AGCAAATTAA AAGAACAATT TAAGAATAAA 1080 
ACAATAGTCT TTAATCAATC CTCAGGAGGG GACCCAGAAA TTGTAATGCA CAGTTTTAAT 1140 
TGTGGAGGGG AATTTTTCTA CTGTAATACA TCACCACTGT TTAATAGTAC TTGGAATGGT 1200 
AATAATACTT GGAATAATAC TACAGGGTCA AATAACAATA TCACACTTCA ATGCAAAATA 1260 
AAACAAATTA TAAACATGTG GCAGGAAGTA GGAAAAGCAA TGTATGCCCC TCCCATTGAA 1320 
GGACAAATTA GATGTTCATC AAATATTACA GGGCTACTAT TAACAAGAGA TGGTGGTAAG 1380 
GACACGGACA CGAACGACAC CGAGATCTTC AGACCTGGAG GAGGAGATAT GAGGGACAAT 1440 
TGGAGAAGTG AATTATATAA ATATAAAGTA GTAACAATTG AACCATTAGG AGTAGCACCC 1500 
ACCAAGGCAA AGAGAAGAGT GGTGCAGAGA GAAAAAAGAG CAGCGATAGG AGCTCTGTTC 1560 
CTTGGGTTCT TAGGAGCAGC AGGAAGCACT ATGGGCGCAG CGTCAGTGAC GCTGACGGTA 1620 
CAGGCCAGAC TATTATTGTC TGGTATAGTG CAACAGCAGA ACAATTTGCT GAGGGCCATT 1680 
GAGGCGCAAC AGCATATGTT GCAACTCACA GTCTGGGGCA TCAAGCAGCT CCAGGCAAGA 1740 
GTCCTGGCTG TGGAAAGATA CCTAAAGGAT CAACAGCTCC TGGGGTTTTG GGGTTGCTCT 1800 
GGAAAACTCA TTTGCACCAC TACTGTGCCT TGGAATGCTA GTTGGAGTAA TAAATCTCTG 1860 
GATGATATTT GGAATAACAT GACCTGGATG CAGTGGGAAA GAGAAATTGA CAATTACACA 1920 
AGCTTAATAT ACTCATTACT AGAAAAATCG CAAACCCAAC AAGAAAAGAA TGAACAAGAA 1980 
TTATTGGAAT TGGATAAATG GGCAAGTTTG TGGAATTGGT TTGACATAAC AAATTGGCTG 2040 
TGGTATATAA AAATATTCAT AATGATAGTA GGAGGCTTGG TAGGTTTAAG AATAGTTTTT 2100 
GCTGTACTTT CTATAGTGAA TAGAGTTAGG CAGGGATACT CACCATTGTC GTTGCAGACC 2160 
CGCCCCCCAG TTCCGAGGGG ACCCGACAGG CCCGAAGGAA TCGAAGAAGA AGGTGGAGAG 2220 
AGAGACAGAG ACACATCCGG TCGATTAGTG CATGGATTCT TAGCAATTAT CTGGGTCGAC 2280 
CTGCGGAGCC TGTTCCTCTT CAGCTACCAC CACAGAGACT TACTCTTGAT TGCAGCGAGG 2340 
ATTGTGGAAC TTCTGGGACG CAGGGGGTGG GAAGTCCTCA AATATTGGTG GAATCTCCTA 2400 
CAGTATTGGA GTCAGGAACT AAAGAGTAGT GCTGTTAGCT TGCTTAATGC CACAGCTATA 2460 
GCAGTAGCTG AGGGGACAGA TAGGGTTATA GAAGTACTGC AAAGAGCTGG TAGAGCTATT 2520 
CTCCACATAC CTACAAGAAT AAGACAGGGC TTGGAAAGGG CTTTGCTATA A 2571 



SEQ. ID. NO. 4 - SYNgp-160mn - codon optimised env sequence 

ATGAGGGTGA AGGGGATCCG CCGCAACTAC CAGCACTGGT GGGGCTGGGG CACGATGCTC 60 
CTGGGGCTGC TGATGATCTG CAGCGCCACC GAGAAGCTGT GGGTGACCGT GTACTACGGC 120 
GTGCCCGTGT GGAAGGAGGC CACCACCACC CTGTTCTGCG CCAGCGACGC CAAGGCGTAC 180 
GACACCGAGG TGCACAACGT GTGGGCCACC CAGGCGTGCG TGCCCACCGA CCCCAACCCC 240 
CAGGAGGTGG AGCTCGTGAA CGTGACCGAG AACTTCAACA TGTGGAAGAA CAACATGGTG 300 
GAGCAGATGC ATGAGGACAT CATCAGCCTG TGGGACCAGA GCCTGAAGCC CTGCGTGAAG 360 
CTGACCCCCC TGTGCGTGAC CCTGAACTGC ACCGACCTGA GGAACACCAC CAACACCAAC 420 
AACAGCACCG CCAACAACAA CAGCAACAGC GAGGGCACCA TCAAGGGCGG CGAGATGAAG 480 
AACTGCAGCT TCAACATCAC CACCAGCATC CGCGACAAGA TGCAGAAGGA GTACGCCCTG 540 
CTGTACAAGC TGGATATCGT GAGCATCGAC AACGACAGCA CCAGCTACCG CCTGATCTCC 600 
TGCAACACCA GCGTGATCAC CCAGGCCTGC CCCAAGATCA GCTTCGAGCC GATCCCCATC 660 
CACTACTGCG CCCCCGCCGG CTTCGCCATC CTGAAGTGCA ACGACAAGAA GTTCAGCGGC 720 
AAGGGCAGCT GCAAGAACGT GAGCACCGTG CAGTGCACCC ACGGCATCCG GCCGGTGGTG 780 
AGCACCCAGC TCCTGCTGAA CGGCAGCCTG GCCGAGGAGG AGGTGGTGAT CCGCAGCGAG 840 
AACTTCACCG ACAACGCCAA GACCATCATC GTGCACCTGA ATGAGAGCGT GCAGATCAAC 900 
TGCACGCGTC CCAACTACAA CAAGCGCAAG CGCATCCACA TCGGCCCCGG GCGCGCCTTC 960 
TACACCACCA AGAACATCAT CGGCACCATC CGCCAGGCCC ACTGCAACAT CTCTAGAGCC 1020 
AAGTGGAACG ACACCCTGCG CCAGATCGTG AGCAAGCTGA AGGAGCAGTT CAAGAACAAG 1080 
ACCATCGTGT TCAACCAGAG CAGCGGCGGC GACCCCGAGA TCGTGATGCA CAGCTTCAAC 1140 
TGCGGCGGCG AATTCTTCTA CTGCAACACC AGCCCCCTGT TCAACAGCAC CTGGAACGGC 1200 
AACAACACCT GGAACAACAC CACCGGCAGC AACAACAATA TTACCCTCCA GTGCAAGATC 1260 
AAGCAGATCA TCAACATGTG GCAGGAGGTG GGCAAGGCCA TGTACGCCCC CCCCATCGAG 1320 
GGCCAGATCC GGTGCAGCAG CAACATCACC GGTCTGCTGC TGACCCGCGA CGGCGGCAAG 1380 
GACACCGACA CCAACGACAC CGAAATCTTC CGCCCCGGCG GCGGCGACAT GCGCGACAAC 1440 
TGGAGATCTG AGCTGTACAA GTACAAGGTG GTGACGATCG AGCCCCTGGG CGTGGCCCCC 1500 
ACCAAGGCCA AGCGCCGCGT GGTGCAGCGC GAGAAGCGGG CCGCCATCGG CGCCCTGTTC 1560 
CTGGGCTTCC TGGGGGCGGC GGGCAGCACC ATGGGGGCCG CCAGCGTGAC CCTGACCGTG 1620 
CAGGCCCGCC TGCTCCTGAG CGGCATCGTG CAGCAGCAGA ACAACCTCCT CCGCGCCATC 1680 
GAGGCCCAGC AGCATATGCT CCAGCTCACC GTGTGGGGCA TCAAGCAGCT CCAGGCCCGC 1740 
GTGCTGGCCG TGGAGCGCTA CCTGAAGGAC CAGCAGCTCC TGGGCTTCTG GGGCTGCTCC 1800 
GGCAAGCTGA TCTGCACCAC CACGGTACCC TGGAACGCCT CCTGGAGCAA CAAGAGCCTG 1860 
GACGACATCT GGAACAACAT GACCTGGATG CAGTGGGAGC GCGAGATCGA TAACTACACC 1920 
AGCCTGATCT ACAGCCTGCT GGAGAAGAGC CAGACCCAGC AGGAGAAGAA CGAGCAGGAG 1980 
CTGCTGGAGC TGGACAAGTG GGCGAGCCTG TGGAACTGGT TCGACATCAC CAACTGGCTG 2040 
TGGTACATCA AAATCTTCAT CATGATTGTG GGCGGCCTGG TGGGCCTCCG CATCGTGTTC 2100 
GCCGTGCTGA GCATCGTGAA CCGCGTGCGC CAGGGCTACA GCCCCCTGAG CCTCCAGACC 2160 
CGGCCCCCCG TGCCGCGCGG GCCCGACCGC CCCGAGGGCA TCGAGGAGGA GGGCGGCGAG 2220 
CGCGACCGCG ACACCAGCGG CAGGCTCGTG CACGGCTTCC TGGCGATCAT CTGGGTCGAC 2280 
CTCCGCAGCC TGTTCCTGTT CAGCTACCAC CACCGCGACC TGCTGCTGAT CGCCGCCCGC 2340 
ATCGTGGAAC TCCTAGGCCG CCGCGGCTGG GAGGTGCTGA AGTACTGGTG GAACCTCCTG 2400 
CAGTATTGGA GCCAGGAGCT GAAGTCCAGC GCCGTGAGCC TGCTGAACGC CACCGCCATC 2460 
GCCGTGGCCG AGGGCACCGA CCGCGTGATC GAGGTGCTCC AGAGGGCCGG GAGGGCGATC 2520 
CTGCACATCC CCACCCGCAT CCGCCAGGGG CTCGAGAGGG CGCTGCTGTA A 2571 









SEQ ID No. 5 - pESYNGP 

TCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTA 

TTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGCTCATGTCC

AATATGACCGCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGG 

GTC ATT AGTTCATAGC CCATATATGGAGTTCCGCGTTACATAAC T TACGGTAAATGGCC C 

GCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCAT 

AGTAACGC C AATAGGGACTTTC C ATTGAC GT GAATGGGTGGAGTATTTACGGTAAAC TGC 

CCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGA 

CGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTG 

GCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACAC 

CAATGGG CGTGG ATAGCGGTTTGACTCACGGGG ATTTCCAAGTCT C CACC CCATTGACGT 

CAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTG 

CGATCGCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATA 

AGCAGAGCTCGTTTAGTGAACCGTCAGATCACTAGAAGCTTTATTGCGGTAGTTTATCAC 

AGTTAAATTGCTAACGCAGTCAGTGCTTCTGACACAACAGTCTCGAACTTAAGCTGCAGT 

GACTCTCTTAAGGTAGCCTTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGTATCAA 

GGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACT 

CTTGCGTTTCTGATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCAC 

AGGTGTCCACTCCCAGTTCAATTACAGCTCTTAAGGCTAGAGTACTTAATACGACTCACT

ATAGG CTAGAGAATTCGCCACCATGGGCGATCCCCTC ACC TGGTCCAAAGC CCTGAAGAA 

ACTGGAAAAAGTCAC CGTTCAGGGTAGC CAAAAGCTTAC CAC AGG C AATTGCAACTGGGC 

ATTGTC CC TGGTGGATCTTTT CC ACGACAC TAATTTC GTTAAGGAGAAAGATTGG CAACT 

CAGAGACGTGATCCCCCTCTTGGAGGACGTGACCCAAACATTGTCTGGGCAGGAGCGCGA 

AGCTTTCGAGCGCACCTGGTGGGCCATCAGCGCAGTCAAAATGGGGCTGCAAATCAACAA

CGTGGTTGACGGTAAAGCT AG CTTTC AACTGC TC CGCGC TAAGTACGAGAAGAAAAC CG C 

C AAC AAGAAACAAT C CGAACCTAG CGAGGAGTAC CCAATTATGATCGACGG CG CCGGC AA 

TAGGAACTT C CGCCCACTGACTC CC AGGGGCTATACCAC CTGGGTCAACAC CATCCAGAC 

AAACGGACTTTTGAACGAAGCCTCCCAGAACCTGTTCGGCATCCTGTCTGTGGACTGCAC 

CTCCGAAGAAATGAATGCTTTTCTCGACGTGGTGCCAGGACAGGCTGGACAGAAACAGAT 

CCTGCTCGATGCCATTGACAAGATCGCCGACGACTGGGATAATCGCCACCCCCTGCCAAA 

CGCCCCTCTGGTGGCTCCCCCACAGGGGCCTATCCCTATGACCGCTAGGTTCATTAGGGG 

ACTGGGGGTGCCCCGCGAACGCCAGATGGAGCCAGCATTTGACCAATTTAGGCAGACCTA 

CAGACAGTGGATCATCGAAGCCATGAGCGAGGGGATTAAAGTCATGATCGGAAAGCCCAA 

GGCACAGAACATCAGGCAGGGGGCCAAGGAACCATACCCTGAGTTTGTCGACAGGCTTCT 

GTCC C AG ATTAAATCCGAAGGCCAC C CT C AGGAGAT C TC CAAGTTCTTG ACAGACACACT 

GACTATCCAAAATGCAAATGAAGAGTGC AG AAACGC CATG AGG CAC C TC AGACCTGAAGA 

TAC C CT GGAGGAG AAAATGTACGCATGTCGCGACATTGG CACTAC C AAG CAAAAGATGAT 

GCTGCTCGCCAAGGCTCTGCAAACCGGCCTGGCTGGTCCATTCAAAGGAGGAGCACTGAA

GGGAGGTCCATTGAAAGCTGCACAAACATGTTATAATTGTGGGAAGCCAGGACATTTATC 

TAGTCAATGTAGAGCAC C TAAAGTCTGTTT TAAATGTAAACAGC CTGGACATTTCTC AAA 

GC AATG CAGAAG TGTTC C AAAAAACGGGAAG CAAGGGG CT CAAGGGAGG CC C C AGAAAC A 

AACTTT C CCGATACAAC AGAAGAGTC AG C ACAACAAATC TGTTGTACAAG AGACTC CT CA 

GACTC AAAATCTGT AC C C AGATCTGAGCGAAAT AAAAAAG GAAT AC AATGTC AAGGAGAA 

GGATC AAGTAGAGGATCT CAAC CTGGACAGTTTGTGGGAGTAAC ATACAATCTCGAGAAG 

AGGCCCACTACCATCGTCCTGATCAATGACACCCCTCTTAATGTGCTGCTGGACACCGGA 

G C C G ACAC CAGCGTT CTCACTACTGCTC ACTATAAC AGACTGAAATACAGAG GAAGGAAA 

TACCAGGGCACAGGCATCATCGGCGTTGGAGGCAACGTCGAAACCTTTTCCACTCCTGTC

ACCATCAAAAAGAAGGGGAGACACATTAAAACCAGAATGCTGGTCGC CGACATCC CCGTC 

ACCATCCTTGGCAGAGACATTCTCCAGGACCTGGGCGCTAAACTCGTGCTGGCACAACTG 

T CTAAGGAAATCAAGTT C CGCAAGATCGAG CTGAAAGAGGG C AC AATGGGTC C AAAAATC 

CCCCAGTGGCCCCTGACCAAAGAGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGCGCCTG 

CTTTCTGAGGG CAAGATTAG CGAGGC CAGC GACAATAACCC TTACAACAGCCCCATCTTT 

GTGATTAAGAAAAGGAGCGGCAAATGGAGACTCCTGCAGGACCTGAGGGAACTCAACAAG 

ACCGTCCAGGTCGGAACTGAGATCTCTCGCGGACTGCCTCACCCCGGCGGCCTGATTAAA 

TGCAAGCACATGACAGTCCTTGACATTGGAGACGCTTATTTTACCATCCCCCTCGATCCT 

G AATTTCGCC C C TATAC TGCTTTT AC C ATC C C C AGCATCAATCAC C AGG AGCC CGATAAA 

CGCTATGTGTGGAAGTGCCTCCCCCAGGGATTTGTGCTTAGCCCCTACATTTACCAGAAG 
ACACTTCAAGAGATCCTCCAACCTTTCCGCGAAAGATACCCAGAGGTTCAACTCTACCAA
TATATGGACG AC CTGTTCATGG GGTC CAACGGGT CT AAGAAGCAGCACAAGGAACTCATC 
ATCGAACTGAGGGCAATCCTCCTGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGCAA 
GAAGTTCCTCCATATAGCTGGCTGGGCTACCAGCTTTGCCCTGAAAACTGGAAAGTCCAG 
AAGATGCAGTTGGATATGGTCAAGAACCCAACACTGAACGACGTCCAGAAGCTCATGGGC 
AATATTACCTGGATGAGCTCCGGAATCCCTGGGCTTACCGTTAAGCACATTGCCGCAACT 
ACAAAAGGATGCCTGGAGTTGAACCAGAAGGTCATTTGGACAGAGGAAGCTCAGAAGGAA 
GTGGAGGAGAATAATGAAAAGATTAAGAATGCTCAAGGGCTCCAATACTACAATCCCGAA 
GAAGAAATGTTGTG CGAGGTCGAAAT C ACTAAGAACTACGAAGC CAC CTATGTC ATCAAA 
C AGT C C C AAGGCAT CT TGTGGGCCGGAAAGAAAATC ATGAAGGC CAACAAAGG CTGGTCC 
AC CGTT AAAAATCTGATGCT CCTGCT C CAGCACGT CGC C AC CGAG T C TATC AC C CG CGTC 
GG C AAG TGCC C CAC C TTCAAAGTT C C CTTCACTAAGGAGC AGGTGATGTGGGAGATGCAA 
AAAGGCTGGTACTACTCTTGGCTTCCCGAGATCGTCTACACCCACCAAGTGGTGCACGAC 
GAC TGGAGAATGAAGC TTGTCGAGGAGC C CACTAGCGGAATTAC AATCTATACCGACGG C 
GGAAAG CAAAACGGAG AGGGAAT CGCTG C ATACGTC AC AT CTAACGGCCGCACC AAGC AA 
AAGAGGCTCGGCCCTGTCACTCACCAGGTGGCTGAGAGGATGGCTATCCAGATGGCCCTT 
GAGGACACTAGAGACAAGCAGGTGAACATTGTGACTGACAGCTACTACTGCTGGAAAAAC 
ATC AC AG AGGG C CTTGGCCTGGAGGGAC CC CAGTCTC C CTGGTGGC CTATC ATCC AGAAT 
ATC CG CGAAAAGGAAATTGT CTATTTCGCC TGGG TGC CTGGAC ACAAAGGAATTTAC GGC 
AAC CAACTCG C CGATG AAGC CG CC AAAATTAAAGAGGAAATCATG CTTGCCTACCAGGGC 
AC AC AGATTAAGGAGAAGAGAGAC GAGGACGCTGGCTTTGAC CTGTGTGTGCCATACGAC 
AT CATG ATTC C CGTTAGCGAC AC AAAGATCATTCCAAC CGATGT CAAGATC CAGGTGCCA 
C CCAATT CATTTGGTTGGGTGAC C GGAAAGTCCAGCATGGC TAAGCAGGGTCTTCTGATT 
AACGGGGGAATCATTGATGAAGGATACACCGGCGAAATCCAGGTGATCTGCACAAATATC 
GGC AAAAGC AATATTAAGCTTATCGAAGGGCAGAAGTTCGCTCAACT CATCATC CT CCAG 
CAC C AC AGCAATTCAAGAC AAC C TTGGG ACGAAAACAAGATTAG CCAGAGAGGTGACAAG 
GG CT TCGGC AG CACAGGTGTGTTCTGGGTGGAGAAC AT C CAGGAAGCACAGGAC GAGCAC 
GAGAATTGGCACAC CTC CC CTAAGATTTTGGC CCGCAATTAC AAGATC CC ACTGAC TGTG 
G CTAAGCAGAT CACACAGGAATGC C CC C AC TGCACCAAACAAGGTTCTGGC CCCGC CGGC 
TGCGTGATGAGGTC CCCCAAT CACTGGC AGGCAGATTG CAC C CACCTCG ACAACAAAATT 
ATCCTGACCTTCGTGGAGAGCAATTCCGGCTACATCCACGCAACACTCCTCTCCAAGGAA 
AATGCATTGTGCACCTCCCTCGCAATTCTGGAATGGGCCAGGCTGTTCTCTCCAAAATCC 
CTGCACACCGACAACGGCACCAACTTTGTGGCTGAACCTGTGGTGAATCTGCTGAAGTTC 
CTGAAAATCGCCCACACCACTGGCATTCCCTATCACCCTGAAAGCCAGGGCATTGTCGAG 
AGGGCCAACAGAACTCTGAAAGAAAAGATCCAATCTCACAGAGACAATACACAGACATTG 
GAGGCCGCACTTCAG C TCG CCCTTATCACCTGCAACAAAGGAAGAG AAAGC ATGGG CG GC 
CAGACCCCCTGGGAGGTCTTCATCACTAACCAGGCCCAGGTCATCCATGAAAAGCTGCTC 
TTGC AG C AGGC C C AGT C CT C CAAAAAGTT CT GCTTTTATAAGAT CC C CGGTGAGCACGAC 
TGGAAAGGTC C TACAAGAGTTTTGTGGAAAGGAG ACGG CG CAGTTGTGGTGAACGATGAG 
GGCAAGGGGATCATCGCTGTGCCCCTGACACGCACCAAGCTTCTCATCAAGCCAAACTGA
AC C CGGGG CGGC C G CTT C CCTTTAGTG AGG GTTAATGC TT CGAG C AG AC ATGATAAGATA 
CATTGATGAGTTTGGACAAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGA 
AATTTGTGATG C TATTGCTTTATTTGTAAC CATTATAAG CTGCAATAAAC AAGTTAACAA 
CAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTTTAAAG 
CAAGTAAAACCTCTACAAATGTGGTAAAATCCGATAAGGATCGATCCGGGCTGGCGTAAT 
AGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGG 
ACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCG 
CTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCA 
CGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTA 
GAGCTTT AC GGCACCTCGAC CGC AAAAAACTTGAT TTGGG TGATGGTTCACGTAGTG GGC 
CATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTG 
GACT CTTG TTCC AAAC TGGAACAAC ACT C AAC C CTATC TCGGTC TATT CTTTTGAT TTAT 
AAGGGATTTTGC CGAT TTCGGCCT AT TGGTTAAAAAATGAGCTGATTTAACAAATATTTA 
ACGCGAATTTTAACAAAATATTAACGTTTACAATTTCGCCTGATGCGGTATTTTCTCCTT 
ACGCATCTGTGCGGTATTTCACACCGCATACGCGGATCTGCGCAGCACCATGGCCTGAAA 
T AACCT CTGAAAGAGGAACTTGGTTAGGTAC CTTC TGAGGCG GAAAGAACC AGCTGTGGA 
ATGTGTGTCAGTTAGGGTGTGGAAAGTC C C CAGGCTCC CC AG C AGG CAGAAGTATGCAAA 
GCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCA 
GAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGC 
CCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTT 
TTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAG 
GAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTTGATTCTTCTGACACAACAGTCT

CGAAC TTAAGGC TAGAGCCAC CATGATTGAAC AAG ATGGATTGCACG C AGGTTCT C CGG C 

CGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGA 

TGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTGAAGACCGACCT 

GTCCGGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGAC 

GGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCT 

ATTGGGCGAAGTGCCGGGGCAGGATCTCGTGTCATCTCACCTTGCTCCTGCCGAGAAAGT 

ATCCATCATGGCTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATT 

CGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGT 

CGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAG 

GCTCAAGGCGCGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTT 

GCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGG 

TGTGG C GGAC CGCTAT CAGGACAT AG CGTTGGCTAC C CGTGATATTGCTGAAGAG CTTGG 

CGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCG 

CATCGCGTTCTATCGCCTTCTTGAGGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATG 

AC CGAC CAAGCGACGCC CAACCTGCCATCACGATGGCCGCAATAAAATATCTTTATTTTC 

ATTACATCTGTGTGTTGGTTTTTTGTGTGAATCGAT AGCGATAAG GATC CG CGTATGGTG 

CAC TCTC AGTAC AATCTGCTC TGATGCCGCATAGTTAAGC C AGC CC CGACAC CC GC CAAC 

ACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGT 

GACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAG 

ACGAAAGGG C CTCGTGATAC GC CTATTTTT ATAGGTTAATGTCATGATAATAATGGTTT C 

TTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTT 

CTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATA

ATATTGAAAAAGGAAG AGTATGAGTATTCAACATTTC CGTGTCGC CCTTATTC CCTTTTT 

TGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGC 

TGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGAT 

CCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCT 

ATGTGGCGCGGTATTATCCC GTATTGACGCCGGG CAAGAGCAACTCGGTCGC CG CATACA 

CTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGG 

CATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAA 

CTT ACTTCT GAC AACGATCGGAGGAC CGAAGGAG CTAAC CGC TTTTTTGC ACAACATGGG 

GGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGA 

CGAG CGTGACAC C ACGATGC CTGTAGCAATGGCAAC AACGTTGCGCAAACTATTAACTGG 

CGAACTACTTACTCTAGCTTC CCGGCAACAATTAAT AGACTGGATGG AGGCG GATAAAGT 

TGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGG 

AGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTC 

CCGTATCGTAGT TAT C TACACGAC GGGGAGTCAGGCAACTATGGATGAAC GAAATAGAC A 

GATCGCTGAGAT AGGTGC CTCACTGATTAAGCATTGGTAACTGTCAGAC CAAGTTTAC TC 

ATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGAT

CCTTTTTGATAATCTCATGACCAAAATGCCTTAACGTGAGTTTTCGTTCCACTGAGCGTC 

AGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTG 

CTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCT 

ACC AAC TCTTTTTC CG AAGGTAAC TGG CTTC AG CAGAGCGC AGATAC CAAATAC TGT C CT 

TCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCT 

CGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGG 

GTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTC 

GTGC ACACAGC C CAGCTTGGAGCGAACGACCTAC AC CGAACTG AGATAC CTACAGC GTGA 

GC TATGAGAAAG CGC CACGC TTCCCGAAGGGAGAAAGG C GGAC AGGTAT C CGGTAAGCGG 

CAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTA 

TAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGG 

GGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTG 

C TGG CCTTTTG C TCACATGG CTCGACAGATCT 
SEQ ID No. 6 - LpESYNGP

TCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTA 
TTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGCTCATGTCC 
AATATGACCGCCATG TTGG CATTGATTATTGACTAGTTATTAATAGTAAT C AATTACGGG 
GTCATTAGTTCATAG C C CATATATGGAGTTCCGCGT TACATAAC TTACGGTAAATG GCC C 
GCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCAT 
AGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGC 
CCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGA 
CGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTG 
GCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACAC 
CAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGT CTC CAC CCCATTGACGT 
CAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTG 
CGAT CG CCCGC C CCGTTGACGCAAATGGGC G GTAGGC GTGT AGGGTGGGAGGTC TATATA 
AGCAGAGCTCGTTTAGTGAAC CGTCAGAT CACTAGAAGCTTTATTGCGGTAGTTTATCAC 
AGTTAAATTGCTAACGCAGTCAGTGCTTCTGACACAACAGTCTCGAACTTAAGCTGCAGT 
GACTCTCT TAAGGTAGC C T TG CAGAAGTTGGTCGTGAGGCACTGGG C AGGTAAGTATC AA 
GGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACT 
CTTGCGTTTCTGATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCAC 
AGGTGTC CACTCCC AGTTCAAT TACAGCTCTTAAGGC TAGAGTACTT AATACGACTCACT 
ATAGGC TAGAGAATTCGAGAGGGGCG CAGACCCTACCTGTTGAAC CTGGCTGATCGTAGG 
ATC C C CGG GACAGC AGAGGAGAACTTAC AGAAGTCTTCTGGAGGTGTTC CTGGC CAGAAC 
ACAGGAGGACAGGTAAGATGGGCGATCCCCTCACCTGGTCCAAAGCCCTGAAGAAACTGG 
AAAAAGTCACCGTTCAGGGTAGCCAAAAGCTTACCACAGGCAATTGCAACTGGGCATTGT 
C CCTG GTGGATCTTTT C CACGACACTAATTT CGTTAAGGAGAAAGATTGGC AACTCAGAG 
ACGTGAT CC C CCTCTTGGAGGACGTGACC CAAAC ATTGTCTGGGC AGGAG CG CGAAGCTT 
TCGAGCGCACCTGGTGGGCCATCAGCGCAGTCAAAATGGGGCTGCAAATCAACAACGTGG 
TTGACGGTAAAGCTAG CTTTCAACTGCTC CG CGCTAAGTACGAGAAGAAAACCGCC AACA 
AGAAACAATC C GAAC CTAGCGAGGAGTAC C C AATTATGATCGACGG CGCCGGC AATAGG A 
ACTTCCGCCCACTGACTCCCAGGGGCTATACCACCTGGGTCAACACCATCCAGACAAACG

GACTTTTGAACGAAGCCTCCCAGAACCTGTTCGGCATCCTGTCTGTGGACTGCACCTCCG 
AAG AAATGAATGCTTTTCTCGACGTGGTGC CAGGACAGG CTGGAC AG AAACAGAT CCTGC 
TCGATGCCATTGACAAGATCGCCGACGACTGGGATAATCGCCACCCCCTGCCAAACGCCC

CTCTGGTGGCTCCCCCACAGGGGCCTATCCCTATGACCGCTAGGTTCATTAGGGGACTGG 
GGGTGC CCCGCGAACG C CAGATGGAGCCAG CATTTGAC CAATTTAGGC AGAC C TACAGAC 

AGTGGATCATCGAAGCCATGAGCGAGGGGATTAAAGTCATGATCGGAAAGCCCAAGGCAC 
AGAACATCAGGCAGGGGGCC AAG GAAC CATACCCTGAGTTTGTCGACAGGCTTCTGTCCC 
AG ATTAAATCCGAAGGC CAC C CTCAGGAGAT CTCCAAGTTCTTGACAGACACACTGACTA 
TCCAAAATGCAAATGAAGAGTGCAGAAACGCCATGAGGCACCTCAGACCTGAAGATACCC 
TGGAGGAGAAAATGTACGCATGTCGCGACATTGGCACTACCAAGCAAAAGATGATGCTGC 
TCGCCAAGGCTCTGCAAACCGGCCTGGCTGGTCCATTCAAAGGAGGAGCACTGAAGGGAG
GTCCATTGAAAGCTGCACAAACATGTTATAATTGTGGGAAGCCAGGACATTTATCTAGTC 
AATGTAGAGCAC CTAAAGTC TGT TTTAAATGTAAAC AGCC TGGACAT TTCTCAAAGCAAT 
G CAGAAGTGTTCC AAAAAACGGGAAG C AAGGGGCTCAAGGGAGG CCC CAGAAAC AAACTT 
TCCCGATACAACAGAAGAGTCAGCACAACAAATCTGTTGTACAAGAGACTCCTCAGACTC 
AAAAT CTGTACC C AGAT CTGAG CGAAATAAAAAAGGAATACAATGTCAAGGAGAAGG ATC 
AAGTAGAGGATCTCAACCTGGACAGTTTGTGGGAGTAACATACAATCTCGAGAAGAGGCC
CACTACCATCGTCCTGATCAATGACACCCCTCTTAATGTGCTGCTGGACACCGGAGCCGA 
CAC CAGCGTTCT CACTACTG CTC AC T ATAACAG ACTGAAATACAGAGGAAGG AAATACCA 
GGGCACAGGCATCATCGGCGTTGGAGGCAACGTCGAAACCTTTTCCACTCCTGTCACCAT 
CAAAAAGAAGGGGAGACACATTAAAACCAGAATGCTGGTCGCCG ACATCC CCGTCAC CAT 
CCTTGGCAGAGACATTCTCCAGGACCTGGGCGCTAAACTCGTGCTGGCACAACTGTCTAA 
GGAAAT CAAGTTC CG CAAGAT CGAG C TGAAAG AGGGC ACAATGGGT CC AAAAATC CC C CA 

GTGGCCCCTGACCAAAGAGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGCGCCTGCTTTC 
TGAGGGCAAGATTAGCGAGGCCAGCGACAATAACCCTTACAACAGCCCCATCTTTGTGAT 
TAAGAAAAGGAGCGGCAAATGGAGACTCCTGCAGGACCTGAGGGAACTCAACAAGACCGT

CCAGGTCGGAACTGAGATCTCTCGCGGACTGCCTCACCCCGGCGGCCTGATTAAATGCAA 
GCACATGACAGTCCTTGACATTGGAGACGCTTATTTTACCATCCCCCTCGATCCTGAATT 
TCGC CC CTATACTG CTTTTACC AT CC C CAGCATC AATCAC C AGGAG C C CGATAAACGCTA 

TGTGTGGAAGTGCCTCCCCCAGGGATTTGTGCTTAGCCCCTACATTTACCAGAAGACACT 
TCAAGAGATC CTC CAAC C TTT C CGCGAAAGAT AC C CAGAGGTT CAACTCTACC AATATAT 
GGACGACCTGTTCATGGGGTCCAACGGGTCTAAGAAGCAGCACAAGGAACTCATCATCGA
ACTGAGGGCAATCCTCCTGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGCAAGAAGT 
TCCTCCATATAGCTGGCTGGGCTACCAGCTTTGCCCTGAAAACTGGAAAGTCCAGAAGAT 
GC AGTTGGATATGGTCAAGAAC C CAAC ACTGAACGACGTC CAGAAGCTCATGGGCAAT AT 
TACCTGGATGAGCTCCGGAATCCCTGGGCTTACCGTTAAGCACATTGCCGCAACTACAAA 
AGGATGCCTGGAGTTGAACCAGAAGGTCATTTGGACAGAGGAAGCTCAGAAGGAACTGGA 

GGAGAATAATGAAAAGATTAAGAATGCTCAAGGGCTCCAATACTACAATCCCGAAGAAGA

AATGTTGTGCGAGGTCGAAATCACTAAGAACTACGAAGCCACCTATGTCATCAAACAGTC 

CCAAGGCATCTTGTGGGCCGGAAAGAAAATCATGAAGGCCAACAAAGGCTGGTCCACCGT 

TAAAAATCTGATG CTCC TG CTCCAGCACGTCGC CACCGAGTCT ATCACC CGCGTCGG CAA 

GTGCCCCACCTTCAAAGTTCCCTTCACTAAGGAGCAGGTGATGTGGGAGATGCAAAAAGG 

CTGGTACTACTCTTGGCTTC C CGAGAT CGTCTACACC C AC C AAGTGGTGC ACGACGACTG 

GAGAATGAAGCTTGTCGAGGAGCCCACTAGCGGAATTACAATCTATACGGACGGCGGAAA 

GCAAAACGGAGAGGGAATCGCTGCATACGTCACATCTAACGGCCGCACCAAGCAAAAGAG 

GCTCGGCCCTGTCACTCACCAGGTGGCTGAGAGGATGGCTATCCAGATGGCCCTTGAGGA 

C ACTAGAGACAAG CAGGTGAAC ATTGTG ACTGAC AGC TACTAC TGC TGGAAAAACATCAC 

AGAGGG CCTTGGC CTGGAGGGAC CC CAGT CTC CCTGGTGG C CTATC ATC C AG AAT ATCCG 

CGAAAAGGAAATTGTCTATTTCGCCTGGGTGCCTGGACACAAAGGAATTTACGGCAACCA 

ACTCG C CGATGAAGCCGCC AAAAT TAAAGAGGAAATCATGCTTG CCTACCAGGGCACACA 

GATTAAGGAGAAGAG AGACGAGGACG CTGGCTTTGACC TGTGTGTGC C ATACGACATCAT 

GATTC CCGTTAG CGACAC AAAGAT CATTCCAACCGATGTCAAGATCCAG GTGCCACC CAA 

TTCATTTGGTTGGGTGACCGGAAAGTCCAGCATGGCTAAGCAGGGTCTTCTGATTAACGG 

GGGAAT CATTGATGAAGGATAC AC CGGCGAAATCCAGGTGATCTGCACAAATATCGG CAA 

AAG CAATATTAAGCTTATCGAAGGGCAGAAGTTCGCTCAAC T CATCATC C TCCAGCACCA 

CAGCAATTCAAGACAACCTTGGGACGAAAACAAGATTAGC CAGAGAGGTGACAAGGG CTT 

CGGCAG CAC AGGTGTGTTCTGGGTGGAGAACAT C CAGGAAG CACAGGAC GAGCACGAGAA 

TTGGCACACCTCC CCTAAGATTTTGGCCCGCAATTACAAGATC CCACTGACTGTGGC TAA 

GCAGATCACACAGGAATGCCCCCACTGCACCAAACAAGGTTCTGGCCCCGCCGGCTGCGT 

GATGAGGTCCCCCAATCACTGGCAGGCAGATTGCACCCACCTCGACAACAAAATTATCCT 

GACCTTCGTGGAGAGCAATTCCGGCTACATCCACGCAACACTCCTCTCCAAGGAAAATGC 

ATTGTGCACCTCCCTCGCAATTCTGGAATGGGCCAGGCTGTTCTCTCCAAAATCCCTGCA 

CACCGACAACGG CACC AACTT TGTGG CTGAACCTGTGGTGAATCTGCTGAAGTTCCT GAA 

AATCGCCCACACCACTGGCATTCCCTATCACCCTGAAAGCCAGGGCATTGTCGAGAGGGC 

C AACAGAACT C TGAAAGAAAAGATC CAATCTC AC AGAGAC AATACACAGACATTGG AGGC 

CGC AC TTC AGCT CGCCCTTATC AC CTGC AACAAAGGAAGAGAAAGCATGGGCGGC CAGAC 

CCCCTGGGAGGTCTTCATCACTAACCAGGCCCAGGTCATCCATGAAAAGCTGCTCTTGCA 

GCAGGCCCAGTCCTCCAAAAAGTTCTGCTTTTATAAGATCCCCGGTGAGCACGACTGGAA 

AGGTCCTACAAGAGTTTTGTGGAAAGGAGACGGCGCAGTTGTGGTGAACGATGAGGGCAA 

GGGGATCATCGCTGTG CC C CTGACACG CACC AAGCTTC TCAT CAAG C CAAAC TG AAC C CG 

GGGCGG C CGCTTCCC T TT AGTGAGGGTTAATGCTTCG AGCAGACATGAT AAGATAC ATTG 

ATGAGTTTGGACAAAC CACAACTAGAATG CAGTGAAAAAAATGC T TTATTTGTGAAATTT 

GTG ATG C T ATTGC TTTATTTGTAAC C ATTATAAGCTGCAATAAACAAGTTAACAAC AAC A 

ATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTTTAAAGCAAGT 

AAAAC CTCTACAAATGTGGTAAAATC CGATAAGGATCG ATC CGGG CTGG CGTAATAGCG A 

AGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGACGCG 

CCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACA 

CTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTC 

GCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGAGCT 

TTACGG CAC CTCGAC CG CAAAAAAC TTG ATTTG GGTGATGGTT CACGTAGTGGG C C AT CG 

CCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTC 

TTGTTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGG 

ATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAATATTTAACGCG 

AATTTTAACAAAATATTAACGTTTACAATTTCGCCTGATGCGGTATTTTCTCCTTACGCA 

TCTGTGCGGTATTTCACACCGCATACGCGGATCTGCGCAGCACCATGGCCTGAAATAACC 

TCTGAAAGAGGAACTTGGTTAGGTACCTTCTGAGGCGGAAAGAACCAGCTGTGGAATGTG 

TGTC AGTTAGGGTGTGGAAAGTC C CCAGGCTCCC CAGCAGGCAGAAGTATGC AAAG CATG 

CATCTCAATTAGTCAGCAACCAGGTG TGGAAAGTC CC CAGG CTC C CCAGC AGGCAGAAGT 

ATGCAAAGCATGCATCTCAATTAGT CAG CAAC C AT AGT C C CGC CC CT AACTCCGC C CATC 

CCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTT 

ATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGC 

TTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTTGATTCTTCTGACACAACAGTCTCGAAC 
TTAAGGCTAGAGCCACCATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTT
GGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCG 
CCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCG 
GTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCG 
TTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGG 
GCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCA 
TCATGGCTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACC 
ACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATC 
AGGATGAT C TGGACGAAGAGC ATCAGGGGCTCGCG CCAGCCGAACTGTT CGC C AGGCTCA 
AGGCGCGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGA 
ATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGG 
CGGACCGCTATCAGGACATAGCGTTGGCTAC CCGT GATATTGCTGAAGAGCTTGGCGGCG 
AATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGGGCATCG 
CCTTCTATCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGA 
CCAAGCGACGCCCAACCTGCCATCACGATGGCCGCAATAAAATATCTTTATTTTCATTAC 
ATCTGTGTGTTGGTTTTTTGTGTGAATCGATAGCG AT AAGGATC CGC GT ATGGTGC ACTC 
TCAGTAC AAT CTGCT CTGATGCCG C ATAGTT AAGCC AGCCC CGACACC CGCCAAC ACCCG 
CTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCG 
TCT C CGGGAGCTG CATGTGTCAGAGGTTTTC AC CGTCATCAC CGAAAC GCGCGAGAC GAA 
AG GGC CTCGTGAT ACGC CT ATTTTT AT AGGTTAATGTC ATGAT AATAATGGTTTCTTAGA 
CGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAA 
TACATTCAAATATGTAT C CGCTCATGAGACAATAAC CCTGATAAATGCTT CAATAATATT 
GAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGG 
CATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAG 
ATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTG 
AGAGTTTT CGCC CC GAAGAACGTTTT C CAATGATGAGCAC TTTTAAAGTT CTGCTATGTG 
GCG CGGTATTAT CC CGTATTGACG CCGGGCAAG AG CAACTCGGTCG C CG CAT ACACTATT 
CTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGA 
CAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTAC 
TTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATC 
ATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGC 
GTGAC AC CACGATGC CTGTAGCAATGGCAACAACGTTG C G C AAACTATTAACTGGCGAAC 
TACTTACTC TAG CTTCCCGGCAAC AATTAATAGACTGGATGGAGGCGG ATAAAGTTGCAG 
GACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCG 
GTGAG CGTGGGT CT CGCGGTATCATTG CAG CACTGGGG C CAGATGGTAAGC CCTCC CGTA 
TCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCG 
CTG AGAT AGGTG C CTC ACTG ATT AAG C ATTGGTAACTGT CAG AC C AAGTTT ACTC ATATA 
TACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTT 
TTGATAAT CTC AT G AC C AAAATC C CTT AACGTGAGTTTTCGTTC CAC TGAG CGTCAGAC C 
CCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCT 
TGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAA 
CTCTTTTTCCGAAG GTAACTGGCTTCAGC AGAGCGCAGATAC CAAATAC TGTCCTTCTAG 
TGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTC 
TGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGG 
ACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCA 
CACAG CC CAGCTTGG AGCGAACGAC CT ACAC CGAACTGAGATAC CTAC AG C GTGAGCT AT 
GAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGG 
TCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTC 
CTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGC 
GGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGC 
CTTTTGCTCACATGGCTCGACAGATCT 
SEQ ID No. 7 - pESYNGPRRE

TCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTA 
TTGGC C ATTGC AT ACGTTGTATCTAT ATCATAATATGTACATT TATATTGGCT CATGTC C 
AATATG AC CGC C ATGTTGGCATTGATT ATTGACT AGTTATTAATAGTAATCAATTAGGGG 
GTC ATTAGTTCATAGC C CAT ATATGGAGTTC CGCGTTAC ATAACTTACGGTAAATGGC C C 
GCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCAT 
AGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGC 
CCAC TTGGCAGTAC AT C AAGTGTAT C ATATG CC AAGT CCGCCCCC TATTGACGT CAATGA 

CGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTG 
GCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACAC 
CAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGT 
CAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTG 
CGATCGCC CGC C C CGTTGACGCAAATGGGCGGTAGG CGTG TACGGTGGG AGGTC TATATA 
AGCAGAGC TCGTTTAGTGAACCGTC AG ATCAC TAGAAGC T TTATTGCGGTAGTTTATC AC 
AGTTAAAT TGC TAACGCAGTC AGTGCT TCTGACAC AACAGT CT CG AACTTAAG CTG C AGT 
GACTCTC TTAAGGTAGCCTTG CAGAAGTTGGTCGTGAGGCACTGGG CAGGTAAGTATC AA 
GGTTACAAGAC AGGTTTAAGGAGAC CAATAGAAACTGGGC TTGT CGAGACAGAGAAGACT 
CTTGCGTTTCTGATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCAC 
AGGTGTC C AC TCC C AGTTC AATTACAG CTCTTAAGGCTAGAGTACTTAATACGACTCACT 
ATAGGCTAGAGAATTCGCCACCATGGGCGATCCCCTCACCTGGTCCAAAGCCCTGAAGAA 
ACTGGAAAAAGT CACCGTT CAGGGTAG C CAAAAGC TTACCAC AGG CAATTGC AACTGGGC 
ATTGTC CCTGGTGGAT CTTTTCCACGAC ACTAATT TCGTTAAG GAGAAAGATTGG CAACT 
CAGAGACGTGATCCCCCTCTTGGAGGACGTGACCCAAACATTGTC TG GGCAGGAGCGCGA 
AGCTTTCGAGCGCACCTGGTGGGCCATCAGCGCAGTCAAAATGGGGCTGCAAATCAACAA 
CGTGGTTGACGGTAAAGCTAGCTTTCAACTGCTCCGCGCTAAGTACGAGAAGAAAACCGC 
CAACAAGAAAC AAT C CGAAC CTAG CGAGGAGTAC CC AATTATGAT CG ACGGCGC CGGCAA 
TAGG AAC TTCCG C C CACTGACTC C CAGGGG CTAT AC CAC CTGGG TCAACACCATC C AGAC 
AAAC GGACTTTTGAAC GAAGC CT C C CAGAACCTGTT CGGCATCCTGT CTGTGG ACTGCAC 
CTCCGAAGAAATGAATGCTTTTCTCGACGTGGTGCCAGGACAGGCTGGACAGAAACAGAT 
CC TGCTCGATG CCATTGACAAGATCGCCGACGAC TGGGATAATCGC CAC C CCCTGC CAAA 
CGCCCCTCTGGTGGCTCCCCCACAGGGGCCTATCCCTATGACCGCTAGGTTCATTAGGGG 
ACTGGGGGTG C CC CGCGAACGCCAGATGGAG CC AGCATTTGACCAATTTAGG CAGACCTA 
CAGAC AGTGGAT C ATCGAAGCCATGAGCGAGGGGATTAAAGTC ATGATCGG AAAGC C C AA 
GGCACAGAACAT CAGGC AGGGGGCC AAGGAAC CATACC CTGAGTTTGTCGACAGGCTTCT 
GTC C CAGATTAAATC CGAAGG CCAC CCT C AGGAGATCT C CAAGT TCTTGAC AGACACACT 
GACTATC CAAAATGCAAATGAAG AGTGC AGAAACG C CATG AGGC AC C TCAGACC TGAAGA 
TAC C C TGG AGGAGAAAATGTACGCATGTCGCGACAT TGGC ACT AC C AAGCAAAAGATGAT 
GCTGCTCGCCAAGGCTCTGCAAACCGGCCTGGCTGGTCCATTCAAAGGAGGAGCACTGAA 
G GGAGGT C CATTGAAAGCTGCACAAAC ATGTTATAATTGTGGGAAGC CAGGACATTTATC 
TAGTCAATGTAGAGCACCTAAAGTCTGTTTTAAATGTAAACAGCCTGGACATTTCTCAAA 
GCAATGCAGAAGTGTTC CAAAAAACGGGAAGCAAGGGG C TCAAGGGAGG CC C CAGAAACA 
AACTTTC CCGAT ACAACAGAAGAGTCAGCACAAC AAATCTGTTGTACAAG AGACTC C TCA 
GACT C AAAAT C TGTAC C CAGAT CTG AG CGAAATAAAAAAGGAATAC AATGTCAAGGAGAA 
GGAT C AAGTAGAGGATC TC AAC C TGGAC AGTTTGTGGGAGT AACATACAAT C T CG AGAAG 
AGGC C CACTAC CATCGTC C TGAT CAATGAC AC C CCTCT TAATGTG CTGCTG G AC AC CGGA 
G C C GACACC AG CGTTCT CACTAC TGCTCAC TAT AACAG ACTGAAAT AC AG AGGAAGGAAA 
TACCAGGGCACAGGCATCATCGGCGTTGGAGGCAACGTCGAAACCTTTTCCACTCCTGTC 
AC CATCAAAAAGAAGGG GAGACAC ATT AAAAC CAGAATGCTGGTCGCCGACAT C C C CGT C 

ACCATCCTTGGCAGAGACATTCTCCAGGACCTGGGCGCTAAACTCGTGCTGGCACAACTG 
TCTAAGGAAATCAAGTTCCGCAAGATCGAGCTGAAAGAGGGCACAATGGGTCCAAAAATC 
CCCCAGTGGCCCCTGACCAAAGAGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGCGCCTG 
CTTT CTG AGGGCAAGATTAGCGAGGCC AGCGACAAT AAC C CTTACAACAG C CC CAT CTTT 
GTGATTAAG AAAAGGAG CGGCAAATG G AGACTC CTG CAGGACCTGAGGGAACTCAAC AAG 
ACCGTCCAGGTCGGAACTGAGATCTCTCGCGGACTGCCTCACCCCGGCGGCCTGATTAAA 
TGCAAGCACATGACAGTCCTTGACATTGGAGACGCTTATTTTACCATCCCCCTCGATCCT 
GAATTTCGCCCCTATACTGCTTTTACCATCCCCAGCATCAATCACCAGGAGCCCGATAAA 
CGCTATGTGTGGAAGTGCCTCCCCCAGGGATTTGTGCTTAGCCCCTACATTTACCAGAAG 
AC AC TT CAAG AGAT CC T C CAAC CTTT C CGCGAAAGATAC CCAG AGGTT CAACTCTAC C AA 
TATATGGACGAC CTGTTCATGGGGTC CAACGGGTCTAAGAAGCAGCACAAGGAACTCATC 
ATCGAACTGAGGGCAATCCTCCTGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGCAA 
GAAGTTCCTCCATATAGCTGGCTGGGCTACCAGCTTTGCCCTGAAAACTGGAAAGTCCAG

AAGATG CAGTTGGATATGGTCAAGAAC C C AAC ACTGAACG ACGT C CAGAAGCT CATGGGC 

AATATTACCTGGATGAGCTCCGGAATCCCTGGGCTTACCGTTAAGCACATTGCCGCAACT 

ACAAAAGGATGCCTGGAGTTGAACCAGAAGGTCATTTGGACAGAGGAAGCTCAGAAGGAA 

CTGGAGGAGAATAATGAAAAGATTAAGAATGCTCAAGGGCTCCAATACTACAATCCCGAA 

GAAGAAATGTTGTGCGAGGTCGAAATCACTAAGAACTACGAAGCCACCTATGTCATCAAA 

CAGTCCCAAGGCATCTTGTGGGCCGGAAAGAAAATCATGAAGGCCAACAAAGGCTGGTCC 

ACCGTTAAAAATCTGATGCTCCTGCTCCAGCACGTCGCCACCGAGTCTATCACCCGCGTC 

GGCAAGTGCCCCACCTTCAAAGTTCCCTTCACTAAGGAGCAGGTGATGTGGGAGATGCAA 

AAAGGCTGGTACTACTCTTGGCTTCCCGAGATCGTCTACACCCACCAAGTGGTGCACGAC 

GACTGGAGAATGAAGCTTGTCGAGGAG C CCACTAGCGGAATTAC AATCTAT AC CGACGG C 

GGAAAGCAAAAC GGAG AGGGAATCGCTGCATACGTC ACATCTAACGG C CGCAC CAAGCAA 

AAGAGGCT C GG C C CTGTCACTCAC CAGGTGGCTGAGAGGATGGCT ATCC AGATGGCG CTT 

GAGGACACTAGAGACAAGCAGGTGAACATTGTGACTGACAGCTACTACTGCTGGAAAAAC 

ATCACAGAGGGCCTTGGCCTGGAGGGACCCCAGTCTCCCTGGTGGCCTATCATCCAGAAT 

ATCCGCGAAAAGGAAATTGTCTATTTCGCCTGGGTGCCTGGACACAAAGGAATTTACGGC 

AACCAACTCGCCGATGAAGCCGCCAAAATTAAAGAGGAAATCATGCTTGCCTACCAGGGC 

AC ACAGATTAAGGAGAAGAGAGACGAGGACGC TGGCTTTG AC CTGTGTG TG CCAT ACG AC 

AT CATGATTC CCGTT AGCGACAC AAAGAT CATTCC AAC CGATGTCAAGATC CAGGTGCC A 

CCCAATTCATTTGGTTGGGTGACCGGAAAGTCCAGCATGGCTAAGCAGGGTCTTCTGATT 

AACGGGGGAATCATTGATGAAGGATACACCGGCGAAATCCAGGTGATCTGCACAAATATC 

GG CAAAAGCAATATT AAGCTTATCGAAGGGCAGAAGTT CGCT C AACT CATCATC CTCC AG 

C AC CACAGC AATT CAAGACAAC CTTGGGAC GAAAAC AAGATTAGC CAGAGAGGTGAC AAG 

GGCTTCGG CAGCACAGGTGTGTTCTGGGTGGAGAACATCCAGGAAGCACAGG ACGAG C AC 

GAGAATTGGCACACCTCCCCTAAGATTTTGGCCCGCAATTACAAGATCCCACTGACTGTG 

GCTAAGCAGAT CACACAGGAATGC C C C CACTGCAC CAAACAAGGTTCTGGCCC CGCCGGC 

TGCGTGATGAGGTCC C CCAATCAC TGGCAGGCAGATTGCAC C CACCTCGACAACAAAATT 

ATCCTGACCTTCGTGGAGAGCAATTCCGGCTACATCCACGCAACACTCCTCTCCAAGGAA 

AATGCATTGTGCACCTCCCTCGCAATTCTGGAATGGGCCAGGCTGTTCTCTCCAAAATCC 

CTGCACACCGACAACGGC AC C AACTTTGTGG CTGAACCTGT GGTGAATCTGCT G AAGTTC 

CTGAAAATCGCCCACACCACTGGCATTCCCTATCACCCTGAAAGCCAGGGCATTGTCGAG 

AGGGCCAACAGAACTCTGAAAGAAAAGATCCAATCTCACAGAGACAATACACAGACATTG 

GAGGCCGCACTTCAGCTCGCCCTTATCACCTGCAACAAAGGAAGAGAAAGCATGGGCGGC 

CAGACCCCCTGGGAGGTCTTCATCACTAACCAGGCCCAGGTCATCCATGAAAAGCTGCTC 

TTGCAG CAGGC C CAGT CC TCCAAAAAGTTCTGCT TTTATAAGATCCC CGGTGAGCACGAC 

TGGAAAGGTC CT AC AAG AGTTTTGTGGAAAGGAGACGGCG CAGTTGTGGTGAACGATGAG 

GGCAAGGGGATCATCGCTGTGCCCCTGACACGCACCAAGCTTCTCATCAAGCCAAACTGA 

ACCCG ACGAATC C C AGGGGGAATCT CAAC CC C TATTACC CAAC AGTC AGAAAAATC T AAG 

TGTGAGGAGAACACAATGTTTCAACCTTATTGTTATAATAATGACAGTAAGAACAGCATG 

GCAGAATCGAAGGAAG CAAGAGAC C AAGAAATGAACCTGAAAGAAG AATC TAAAGAAGAA 

AAAAGAAGAAATGACTGGTGGAAAATAGGTATGTTTCTGTTATGCTTAGCCAGGGCCCTC 

TGGAAGGTGAC CAGTGGT GC AGGGT C CT C CGGC AGTCGT TAC CTG AAG AAAAAATTC CAT 

CAC AAAC ATGC ATCGCGAGAAGACAC CTGGGAC CAGGCC CAACACAACATACAC C TAGCA 

GGCGTGACCGGTGGAT CAGGGGACAAATACTACAAGCAGAAGTACT C C AGGAACGACTGG 

AATGGAGAATCAGAGGAGTACAACAGGCGGCCAAAGAGCTGGGTGAAGTCAATCGAGGCA

TTTGGAGAGAGC TATATTTC CGAGAAGAC C AAAGGGGAGATTTCT C AG CCTGGGGCGGCT 

ATCAACGAG CACAAGAACGGCTCTGGGGGGAACAATCC TCAC C AAGGGTC C TT AGAC CTG 

GAGATTCGAAGCGAAGGAGGAAACATTTATGACTGTTGCATTAAAGCCCAAGAAGGAACT 

CTCGCTATCCCTTGCTGTGGATTTCCCTTATGGCTATTTTGGGGGTCGGGGCGGCCGCTT 

CCCTTTAGTGAGGGTTAATGCTTCGAGCAGACATGATAAGATACATTGATGAGTTTGGAC 

AAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGAAATTTGTGATGCTATTG

CTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAACAACAACAATTGCATTCATT 

TTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTTTAAAGCAAGTAAAACCTCTACA 

AATGT GGTAAAAT CCGATAAGGATC GAT C CGGGCTGGCGTAATAG CGAAGAGGC C CG CAC 

CGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGACGCGCCCTGTAGCGGC 

GCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCC 

C TAG CGCCCG C TC CTTTCGCTTTCTTCCCTTCCTTTCTCGC CACGTT CG C CGG C TTTCCC 

CGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGAGCTTTACGGCACCTC 

GACCGCAAAAAACTTGATTTGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACG 

GTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACT 

GGAAC AACACTC AAC C CT AT CTCGGTC TATT CTTTTGATTTATAAGGG ATTTTG C CGATT 
TCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAATATTTAACGCGAATTTTAACAAA
ATATTAACGTTTACAATTTCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTAT 
TTCACACCG CATACGCGGAT CTGCGCAG CAC C ATGGC CTGAAATAAC CTCTGAAAG AGGA 
ACTTGGTTAGGTACCTTCTGAGGCGGAAAGAACCAGCTGTGGAATGTGTGTCAGTTAGGG 
TGTGGAAAGTCCCCAGGCTCCCCAGGAGGCAGAAGTATGCAAAGCATGCATCTCAATTAG 
TCAGCAACCAGGTGTGGAAAGTCCCCAGGCTC C C CAGC AGGCAGAAGTATGCAAAGCATG 
CATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACT 
CCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAG

GCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGC 

CTAGGCTTTTGCAAAAAGCTTGATTCTTCTGACACAACAGTCTCGAACTTAAGGCTAGAG 

CCACCATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGC 

TATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGC 

TGTCAGCGCAGGGGCGGCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATG 

AACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAG 

CTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGG 

GGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGATG 

CAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAAAC 

ATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGG 

ACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGCGCATGC 

CCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCATGGTGG 

AAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTATC 

AGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACC 

GCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCC 

TTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGACCAAGCGACGCC 

CAACCTGCCATCACGATGGCCGCAATAAAATATCTTTATTTTCATTACATCTGTGTGTTG 

GTTTTTTGTGTGAATCGATAG CGATAAGGATC CG CGTATGGTGCACTC TC AGTACAATCT 

GCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCT 

GACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCT 

GCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCCTCGTGA 

TACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCA 

CTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATA 

TGTATCCG CTCATGAGAGAATAAC C CTGATAAATGCTT CAATAAT ATTGAAAAAGGAAGA 

GTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTG 

CTGTTTTTGCTCAC C CAGAAACG CTGGTGAAAGTAAAAGATGCTGAAGAT CAGTTGGGTG 

CACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCC

C CGAAGAACGTT TTC C AATGATGAGC AC TTTTAAAGTTC TGCTATGTGGCG CGGTATT AT 

CCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGGATAGACTATTCTCAGAATGACT 

TGGTTGAGTACT C AC CAGTCACAGAAAAGCATCTTACGG ATGGC ATGACAGTAAGAGAAT 

TATGCAGTG CTG C CATAAC CATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACG A 

TC GGAGGAC CGAAGGAG CTAAC CG CTTTTTTGC ACAACATGGGGGATCATGTAACTCGCC 

TTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGA

TGC C TGTAGCAATGGC AACAACGTTG CGC AAAC TATT AACTGGCGAACTACTT ACT CT AG 

CTTC C CGG CAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGG AC CACTT CTGC 

GCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGT 

CTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCT 

AC ACGAC GGGGAGTCAGGCAACTATGGATGAACGAAATAGAC AGATCGCTGAGATAG GTG 

C C TC ACTGATTAAGCAT TGGTAAC TGT CAGAC CAAGTTTAC T CATATATAC TTTAGATTG 

ATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCA 

TGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGA 

TCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAA 

AACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGA 

AGGTAACTGGCT TCAG CAGAG CGC AG ATAC C AAATACTGTCC TTCTAGTGTAGC CGTAGT 

TAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGT 

TACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGAT 

AGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCT 

TGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCA

CGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAG 

AGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTC 

GCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGA 

AAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACA

TGGC T CGACAGAT CT 
SEQ ID No. % - LpESYNGPRRE

TCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTA 

TTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGCTCATGTCC 

AATATGACCGCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGG 

GTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCC 

GCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCAT 

AGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGC 

C CACTTGG CAGTACATCAAGTGTATCATATG CCAAGT CCGC CC C CTATTG ACGTCAATGA 

CGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTG 

GCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACAC 

CAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGT 

CAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTG 

CGATCGCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATA 

AGCAGAGCTCGTTTAGTGAACCGTCAGATCACTAGAAGCTTTATTGCGGTAGTTTATCAC 

AGTTAAATTGCT AACGC AGT C AGTGCTT CTGACAC AAC AGTCT CGAACTT AAGCTGC AGT 

GACTCTCTTAAGGTAGCCTTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGTATCAA 

GGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACT 

CTTGCGTTTGTGATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCAC 

AGGTGT C CACT CC CAGTTC AATTACAG CTCTTAAGG CTAGAGTACTTAATACGACTCACT 

ATAGGCTAGAGAATTCGAGAGGGGCGCAGACCCTACCTGTTGAACCTGGCTGATCGTAGG 

ATCCCCGGGACAGCAGAGGAGAACTTACAGAAGTCTTCTGGAGGTGTTCCTGGCCAGAAC 

ACAGGAGGACAGGTAAGATGGGCGATC CC CTCAC CTGGTCCAAAGC CCTGAAGAAACTGG 

AAAAAGTCACCGTT<^GGGTAGCC^y\AAGCTTACCACAGGCAATTGCAACTGGGCATTGT 

C C C TGGTGGATCTTTTCCACGACACTAATTT CGTTAAGGAGAAAGATTGGC AACTC AGAG 

ACGTGATCCCCCTCTTGGAGGACGTGACCCAAACATTGTCTGGGCAGGAGCGCGAAGCTT 

TC(^GCGCACCTGGTGGGCC^TCAGCGCAGTCAAAATGGGGCTGCAAATCAACAACGTGG 

TTGACGGTAAAGCTAGCTTTC AACTG CTC CG CG CTAAGTACGAGAAGAAAACCGCCAAC A 

AGAAACAATCCGAAC CTAGCGAGGAGTAC C CAATTATGATCGACGGCG CCGGCAATAGGA 

AC T TCCGC CCACTGACTC CCAGGGG CTATAC C ACCTGGGT CAACACCATC C AGACAAACG 

GACTTTTGAACGAAGCCTCCCAGAACCTGTTCGGCATCCTGTCTGTGGACTGCACCTCCG 

AAGAAATGAATGCTTTTCTCGACGTGGTGCCAGGACAGGCTGGACAGAAACAGATCCTGC 

TCGATGCCATTGACAAGATCGCCGACGACTGGGATAATCGCCACCCCCTGCCAAACGCCC 

CTCTGGTGGCTCCCCCACAGGGGCCTATCCCTATGACCGCTAGGTTCATTAGGGGACTGG 

G GGTGC CCCGCGAACG C CAGATGGAGCC AG C ATTTGACC AATTTAGGCAGACCT ACAGAC 

AG TGGATC ATCGAAGC CATGAGCGAGGGGATTAAAGTCATGATCGGAAAGCC CAAGGC AC 

AGAACAT C AGG C AGGGGGC C AAGGAAC CAT AC C CTGAGTTTGTCGACAGGCTTCTGTC C C 

AGATTAAATCCGAAGGCCACCCTCAGGAGATCTCCAAGTTCTTGACAGACACACTGACTA 

T CC AAAATGCAAATGAAGAGTG CAGAAACGCCATGAGG CACCTCAGAC C TGAAGATAC C C 

TGG AGG AG AAAATGTACGC ATGTCGCG AC ATTGGC AC TAC C AAGC AAAAGATGATGCTG C 

TCGCCAAGGCTCTGCAAACCGGCCTGGCTGGTCCATTCAAAGGAGGAGCACTGAAGGGAG 

GTC C ATTGAAAGCTGCAC AAAC ATGTTATAATTGTGGGAAGC CAGGACATTTAT CTAGTC 

AATGTAGAGCACCTAAAGTCTGTTTTAAATGTAAACAGCCTGGACATTTCTCAAAGCAAT 

GCAGAAGTGTT CCAAAAAACGGGAAGCAAGGGGC TC AAGGG AGGCCC CAGAAACAAACTT 

T C C CGAT AC AACAGAAGAGT C AG CAC AAC AAAT CTGTTGTAC AAGAGACTCCTCAGACTC 

AAAATCTGTACCCAGATCTGAGCGAAATAAAAAAGGAATACAATGTCAAGGAGAAGGATC 

AAGT AGAGGATCTCAAC CTGGAC AGTTTGTGGGAGTAACATACAATC TCGAGAAGAGG CC 

CACT AC CATCGTC CTGATC AATGAC ACCC CT CTT AATGTG CTGCTGGAC ACCGGAG C CGA 

CACCAGCGTTCTCACTACTGCTCACTATAACAGACTGAAATACAGAGGAAGGAAATACCA 

GGGCACAGGCATCATCGGCGTTGGAGGCAACGTCGAAACCTTTTCCACTCCTGTCACCAT 

CAAAAAG AAGGGG AG AC AC ATT AAAACC AG AATGCTGGTCGC C G ACATCC CCGTCAC CAT 

CCTTGGCAGAGACATTCTCCAGGACCTGGGCGCTAAACTCGTGCTGGCACAACTGTCTAA 

GGAAATC AAGTTC CGC AAGAT CGAGCTGAAAGAGGG CACAATGGG TC CAAAAATCC CCC A 

GTGGCCCCTGACCAAAGAGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGCGCCTGCTTTC 

TGAGGGC AAGATT AGCGAG G CCAGCGACAATAAC CCTTACAACAG CCC CATCTTTGTG AT 

TAAGAAAAGGAGCGGCAAATGGAGACT C CTGCAGGAC C TG AGGGAACT CAACAAGAC CGT 

CCAGGTCGGAACTGAGATCTCTCGCGGACTGCCTCACCCCGGCGGCCTGATTAAATGCAA 

GCACATGACAGTCCTTGACATTGGAGACGCTTATTTTACCATCCCCCTCGATCCTGAATT 

TCGCCCCTATACTGCTTTTACCATCCCCAGCATCAATCACCAGGAGCCCGATAAACGCTA 

TGTGTGGAAGTGCCTCCCCCAGGGATTTGTGCTTAGCCCCTACATTTACCAGAAGACACT 

T CAAG AGATC CTCCAAC C TT T C CGCGAAAG ATACCCAGAGGTT C AACTCT AC C AATATAT 
GGACGACCTGTTCATGGGGTCC^ACGGGTCTAAGAAGCAGCACAAGGAACTCATCATCGA
ACTGAGGGCAATCCTCCTGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGCAAGAAGT 
T CCTCCATATAGCTGGCTGGGCTACCAGCTTTGCCCTGAAAACTGGAAAGTC CAGAAGAT 
GCAGTTGGATATGGTCAAGAACCCAACACTGAACGACGTCCAGAAGCTCATGGGCAATAT 
TACCTGGATGAGCTCCGGAATCCCTGGGCTTACCGTTAAGCACATTGCCGCAACTACAAA 
AGGATGCCTGGAGTTGAACCAGAAGGTCATTTGGACAGAGGAAGCTCAGAAGGAACTGGA 
GGAGAATAATGAAAAGATTAAGAATGCTCAAGGGCTCCAATACTACAATCCCGAAGAA^ 
AATGTTGTGCGAGGTCGAAATCACTAAGAACTACGAAGCCACCTATGTCATCAAACAGTC 
CCAAGGCATCTTGTGGGCCGGAAAGAAAATCATGAAGGCCAACAAAGGCTGGTCCACCGT 
TAAAAATCTGATGCTCCTGCTCCAGCACGTCGCCACCGAGTCTATCACCCGCGTCGGCAA 
GTGCCCCACCTTCAAAGTTCCCTTCACTAAGGAGCAGGTGATGTGGGAGATGCAAAAAGG 
CTGGTAC T ACTC TTGGC TTC CC GAGATC GT CT ACAC C CAC C AAGTGGTG C ACGACGACTG 
GAGAATGAAGCTTGTCGAGGAGCCCACTAGCGGAATTACAATCTATACCGACGGCGGAAA 
G C AAAACGGAGAGGGAATCG CTGCAT AC GT CACATCTAACGG C CGCACC AAGC AAAAGAG 
GCTCGGCCCTGTCACTCACCAGGTGGCTGAGAGGATGGCTATCCAGATGGCCCTTGAGGA 
CAC TAGAGAC AAG CAGGTGAACATTGTGACTGACAGCTACT ACTGCTG GAAAAACAT CAC 
AGAGGGCCTTGGCCTGGAGGGACCCCAGTCTCCCTGGTGGCCTATCATCCAGAATATCCG 
CGAAAAGGAAATTGTCTATTTCGCCTGGGTGCCTGGACACAAAGGAATTTACGGCAACCA 
ACTCGC CGATGAAG C CG C CAAAATTAAAGAGGAAATCATGCTTGCCTACCAGGGCACACA 
GATT AAGGAGAAGAGAGACG AGG ACGCTGG CTTTGAC CTGTGTGTGCCATACGAC ATCAT 
GATTCC CGTTAGCGACACAAAGATCATTCCAACCGATGTCAAGATC C AGGTG CCAC C CAA 
TTCATTTGGTTGGGTGACCGGAAAGTCCAGCATGGCTAAGCAGGGTCTTCTGATTAACGG 
G GG AATCATTGATGAAGGATACAC CGGCGAAATCCAGGTGAT CTGCACAAATATCGGC AA 
AAGCAATATTAAGCTTAT CGAAGGGCAGAAGTTCGCTCAAC TCATCATCCTCCAGCAC CA 
CAG CAATTC AAGAC AACCTTGGGACGAAAACAAGATTAGC CAGAGAGGTGACAAGGGCTT 
CGGCAGCACAGGTGTGTTCTGGGTGGAGAACATCCAGGAAGCACAGGACGAGCACGAGAA 
TTGGCACACCTCCCCTAAGATTTTGGCCCGCAATTACAAGATCCCACTGACTGTGGCTAA 
GCAGATCACACAGGAATGCCCCCACTGCACCAAACAAGGTTCTGGCCCCGCCGGCTGCGT 
GATGAG GTCCC C CAATCACTGGCAGGC AGATTGCAC C C AC CTCGACAAC AAAATTATCCT 
GACCTTCGTGGAGAGCAATTCCGGCTACATCCACGCAACACTCCTCTCCAAGGAAAATGC 
ATTGTGCACCTCCCTCGCAATTCTGGAATG GGCCAGGCTG TTCTCTCCAAAATCCCTGCA 
CACCGACAACGGCACCAACTTTGTGGCTGAACCTGTGGTGAATCTGCTGAAGTTCCTGAA 
AATCGCC C ACAC C ACTGGCATTC C CTATC ACC CTGAAAGC C AGG GC ATTGTCGAGAGGGC 
CAACAGAACTCTGAT^AGAAAAGATCCAATCTCACAGAGACAATACACAGACATTGGAGGC 
CGCACTTCAGCTCGCCCTTATCACCTGCAACAAAGGAAGAGAAAGCATGGGCGGCCAGAC 
CCCCTGGGAGGTCTTCATCACTAACCAGGCCCAGGTCATCCATGAAAAGCTGCTCTTGCA 
GCAGGCCCAGTCCTCCT^AAAAGTTCTGCTTTTATAAGATCCCCGGTGAGCACGACTGGAA 
AGGTCCTACAAGAGTTTTGTGGAAAGGAGACGGCGCAGTTGTGGTGAACGATGAGGGCAA 
GGGGATCATCGCTGTGCCCCTGACACGCACCAAGCTTCTCATCAAGCCAAACTGAACCCG 
ACG AATC C C AGGGGGAATCTC AACC CCTATT ACC C AAC AGT C AGAAAAATCTAAGTGTGA 
GGAGAACACAATGTTTCAACCTTATTGTTATAATAATGACAGTAAGAACAGCATGGCAGA 
AT CGAAGGAAGCAAG AGAC C AAGAAATGAACC TGAAAGAAGAAT CTAAAGAAGAAAAAAG 
AAGAAATGACTGGTGGAAAATAGGTATGTTTCTGT TATG CTTAGCCAGGG CC CT CTGGAA 
GGTGAC CAGTG GTGC AGGGT CC T C CGGCAGTCGTTACCTGAAGAAAAAATTC CATC AC AA 
AC ATGC AT CGC GAGAAG ACACC TGGGAC C AGGC C CAACACAACAT AC AC CT AG C AGGCGT 
GAC CGGTGGATCAGGGGACAAATACTAC AAG C AGAAGTACT C CAGGAACG AC TGGAATGG 
AGAAT C AGAGGAGTACAACAGG CGGCCAAAGAGC TGGGTGAAGTCAATCGAGGCATTTG G 
AGAGAGCTATATTTCCGAGAAGACCAAAGGGGAGATTTCTCAGCCTGGGGCGGCTATCAA 
CGAGCACAAGAACGGCTCTGGGGGGAACAATCCTCACCAAGGGTCCTTAGACCTGGAGAT 
TCGAAGCGAAGGAGGAAACATTTATGACTGTTGCATTAAAGCCCAAGAAGGAACTCTCGC 
TATCCCTTGCTGTGGATTTCCCTTATGGCTATTTTGGGGGTCGGGGCGGCCGCTTCCCTT 
TAG TG AGGGTTAATGCTTCGAG C AGAC ATGATAAGATAC ATTGATGAGTTTGG ACAAAC C 
ACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTA 
TTTGTAACCATTATAAGCTGCAATAAACAAGTTAACAACAACAATTGCATTCATTTTATG 
TTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTTTAAAGCAAGTAAAACCTCTACAAATGT 
GGTAAAATCCGATAAGGATCGATCCGGGCTGGCGTAATAGCGAAGAGGCCCGCACCGATC 
GCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGACGCGCCCTGTAGCGGCGCATT 
AAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGC 
GCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCA 
AGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGAGCTTTACGGCACCTCGACCG 
CAAAAAACTTGATTTGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTT 
TCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAAC
AACACTCAACCGTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGC 
CTATTGGTTAAAAAATGAGC TGATTTAACAAATATTTAACG CGAATTTTAACAAAATATT 
AACGTTTACAATTTCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCAC 
ACCGCATACGCGGATCTGCGCAGCACCATGGCCTGAAATAACCTCTGAAAGAGGAACTTG 
GTTAGGTACCTTCTGAGGCGGAAAGAACCAGCTGTGGAATGTGTGTCAGTTAGGGTGTGG 
AAAGTG C CC AGGCTCC CC AGCAGG C AGAAGTATG C AAAGCATG CATC T C AATTAGTC AGC 
AAC C AGGTGTGG AAAGTC CC CAGGC TC C C C AGC AGG CAGAAGTATGCAAAGC ATGC ATCT 
CAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCC 
CAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGA 
GGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGG 
CTTTTG C AAAAAG C TTGATTCTT CTG ACACAACAGTCTCG AACTTAAGGC TAGAGCCACC 
ATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTC
GGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCA 
GCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTG 
CAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTG 
CTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAG 
GATCT C C TGTCATC TC ACCTTGCT CCTG C CGAGAAAGTATCCATGATGGCTGATGC AATG 
CGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAAACATCGC 
ATCGAG CGAGCACGTACTCGGATGGAAGC CGGT CTTGTCGAT C AGGATG ATC TGGACGAA 
GAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGCGCATGCCCGAC 
GGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCATGGTGGAAAAT 
GGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTATCAGGAC 
ATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTC 
CTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCTT 
GACGAGT TCTT CTGAGCGGGACTCTGGGGTTCGAAATGAC C GAC CAAG CGAC GC C CAAC C 
TGCCATCACGATGGCCGCAATAAAATATCTTTATTTTCATTACATCTGTGTGTTGGTTTT 
TTGTGTGAATCGATAGCGATAAGGATCCGCGTATGGTGCACTCTCAGTACAATCTGCTCT 
GATGCCGCATAGTTAAGCCAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGG 
GCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATG 
TGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCCTCGTGATACGC 
CTATTT T TATAGGTTAATGTC ATGATAATAATGGTTTC TTAGACGTCAGGTGGCACTTTT 
CGGGGAAATGTG C GC GGAAC C CCTATTTGTTTATTTTTCTAAATACATTCAAATATGTAT 
CCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATG 
AGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTT 
TTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGA 
GTGGGTTAC AT C GAACTGGATCTCAACAG CGGTAAGATCCTTGAGAGT TTT CGCCCCGAA 
GAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGT 
ATTGACG CCGGGCAAGAG C AACTCGGTCG C CGCATAC ACTATTCTCAGAAT GACTTGGTT 
GAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGC
AGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGA 
GGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGAT 
CGTTGG GAAC C G GAGC TG AATGAAG CCATAC C AAACG ACGAG CGTGACAC CACGATGCC T 
GTAGCAATGG C AAC AACGTTGCGC AAACTATTAACTGGCG AACTACTTAC T CTAG CTTC C 
CGGC AAC AATT AAT AGAC TGGATGGAGGC GGATAAAGTTGCAGGACC ACTT CTG CGCT CG 
GCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGTCTCGC 
GGT AT CATTG CAGCAC TGGGGC C AGATGGTAAGC C CTC CC GTAT CGTAGTTAT CTAC ACG 
ACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCA 
CTGATT AAGC ATTGGT AAC TGTCAGAC C AAGTTTACTC AT ATAT ACTT T AG ATTGATTT A 
AAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACC 
AAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAA 
GGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCA 
CCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTA 
AC TGGC TTC AG CAGAG CGC AGAT ACCAAATACTGTCCTTCTAGTGTAG C CGT AGTTAGG C 
CAC C AC TTC AAGAAC T CTGTAGCACCGC C T AC AT AC C TCGCTCTGC TAATC CTGTTAC C A 
GTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTA 
CCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAG 
CGAACGAC CTACAC CGAAC TGAGATAC C TAC AG CGTGAGCT ATGAGAAAGCGC C ACGCTT 
C C CGAAGGGAGAAAGG CGGAC AGGT ATC CGGTAAG CGGCAGGGT CGGAACAGGAGAGCGC 
ACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCAC 
CTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAAC 
GCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGGCT
CGACAGATCT 
SEQ ID No. ^ - pONY4.0Z

CTAAATTGT AAGCGTTAATATTTTGTTAAAATTCGCGTTAAAT TT TTGTTAAATC AG C TC 
ATTTTTTAACCAATAGGCCGAAATCGGCAAAATCCCTTATAAATCAAAAGAATAGACCGA 
GATAGGGTTGAGTGTTGTTCCAGTTTGGAACAAGAGTCCACTATTAAAGAACGTGGACTC 
CAACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGAACCATCACC 
CTAATCAAGTTTTTTGGGGTCGAGGTGCCGTAAAGCACTAAATCGGAACCCTAAAGGGAG 
CCCCCGATTTAGAGCTTGACGGGGAAAGCCAACCTGGCTTATCGAAATTAATACGACTCA 
CTATAGGGAGACCGGCAGATCTTGAATAATAAAATGTGTGTTTGTCCGAAATACGCGTTT 
TGAGATTTCTGTCGCCGACTAAATTCATGTCGCGGGATAGTGGTGTTTATCGGCGATAGA 
GATGGCGATATTGGAAAAATTGATATTTGAAAATATGGCATATTGAAAATGTCGCCGATG 
TGAGTTTC TGTG TAACTGATAT CG C CATTTTTCC AAAAG TGATTTTTGGGCATACGCGAT 

ATCTGGCGATAG CGCTTATATCGTTTACGGG GGATGGCGATAGACGACTTTGGTGACTTG 
GGCGATTCTGTGTGTCGCAAATATCGCAGTTTCGATATAGGTGACAGACGATATGAGGCT 
ATATCG CCGATAGAGG CGAC ATCAAGCTGG C ACATGGCCAATG CATATCG ATCTATACAT 
TGAATCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTG 
GCT ATTGGC C ATTGCATACGT TGTAT C CATATCGTAATATGT ACATTTAT ATTGGCT CAT 
GTCCAACATTACCGCCATGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTA 
CGGGGTCAT TAGTTCATAGCC CATATATGGAG TTCCGCGTTACATAACTTACGGTAAATG 
GCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTC 
CCATAGTAACGC CAATAGGGACT TTCCATTGACGTCAATGGGTGGAGTATTTAC GGT AAA 
CTGCC CACTTGG CAGTACAT CAAGTGTAT CATATGCCAAGT CCGCCCCCTATTGACGTCA 
ATGACGGTAAATGGC CCGC CTGGCATTATGC CC AGTACATGACCTTACGGGAC TTTCCTA 
CTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGT 
ACACCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTG 
ACGTCAATGGGAGTTTGTTTTGGCAC CAAAAT CAACGG GAC TTTC C AAAATGTCGTAACA 

ACTGCGATCGCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTA 

TATAAGCAGAGCTCGTTTAGTGAACCGGGCAGTCAGATTCTGCGGTCTGAGTCCCTTCTC 

TGCTGGGCTGAAAAGGCCTTTGTAATAAATATAATTCTCTACTCAGTCCCTGTCTCTAGT 

TTGTCTGTTCGAGATCCTACAGTTGGCGCCCGAACAGGGACCTGAGAGGGGCGCAGACCC 

TACCTGTTGAACCTGGGTGATCGTAGGATCCCCGGGACAGCAGAGGAGAACTTACAGAAG 

TCTT CTGGAGGTGT TC CTGG CCAGAACAC AGGAGGACAGGTAAGATGGGAGACCCTTTGA 

CATGGAGCAAGGCG C T CAAGAAGTTAGAG AAGGTGACGGT AC AAGGGTCTCAGAAATTAA 

CTACTGGTAACTGTAATTGGGCGCTAAGT CTAGTAGAC T TATTT CATGATACCAAC TTTG 

TAAAAGAAAAGGACTGGCAG CTGAGGGATGTGATTC CATTGC TGGAAGATGT AACT CAGA 

CGCTGTCAGGACAAGAAAGAGAGGCCTTTGAAAGAACATGGTGGGCAATTTCTGCTGTAA 

AGATGGGCCTCCAGATTAATAATGTAGTAGATGGAAAGGCATCATTCCAGCTCCTAAGAG 

CGAAATATGAAAAGAAGACTGCTAATAAAAAGCAGTCTGAGCCCTCTGAAGAATATCTCT 

AGAACTAGTGGATCCCCCGGGCTGCAGGAGTGGGGAGGCACGATGGCCGCTTTGGTCGAG 

GCGGATCCGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTA 

TTGGCCATTGCATACGTTGTATCCATATCATAATATGTACATTTATATTGGCTCATGTCC 

AACATTACCGCCATGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGG 

GT CATTAGTTCATAG CCCATATATGGAGTTC CGCGTTACATAACTTACGGTAAATGGC CC 

G CCTGG CTGACCGCC C AACGAC CCC CG C C CATTGACGTCAATAATGACGTATGTTCC C AT 

AGTAACGCCAATAGGGAC TTT C CAT T GACGT C AATGGGTGGAGTATTTACGGTAAAC TGC 

CCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGA 

CGGTAAATGG C CCGC CTGGCATTATGC C C AGTACATGAC C TTATGGGACTTTC CTAC TTG 

GCAGTACATC TACGTATTAGTCATCGCTATTACCATGGTGATG CGGTTTTGGCAGTACAT 
CAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGT 
CAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTC 
CGCCCCATTGACGCAAATGGGCGGTAGGCATGTACGGTGGGAGGTCTATATAAGCAGAGC 
TCGTTTAGTGAACCGTCAGATCGCCTGGAGACGCCATCCACGCTGTTTTGACCTCCATAG 
AAGACACCGGGACCGATCCAGCCTCCGCGGCCCCAAGCTTCAGCTGCTCGAGGATCTGCG 
GATCCGGGGAATTCCCCAGTCTCAGGATCCACCATGGGGGATCCCGTCGTTTTACAACGT 
CGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTC 
GCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGC
CTGAATGGCGAATGGCGCTTTGCCTGGTTTCCGGCACCAGAAGCGGTGCCGGAAAGCTGG 
CTGGAGTGCGATCTTCCTGAGGCCGATACTGTCGTCGTCCCCTCAAACTGGCAGATGCAC 
GGTTACGATGCGCCCATCTACACCAACGTAACCTATCCCATTACGGTCAATCCGCCGTTT 
GTTCCCACGGAGAATCCGACGGGTTGTTACTCGCTCACATTTAATGTTGATGAAAGCTGG 
CTACAGGAAGGCCAGACGCGAATTATTTTTGATGGCGTTAACTCGGCGTTTCATCTGTGG 
TGCAACGGGCGCTGGGTCGGTTACGGCCAGGACAGTCGTTTGCCGTCTGAATTTGACCTG
AGCGCATTTTTACGCGCCGGAGAAAACCGCCTCGCGGTGATGGTGCTGCGTTGGAGTGAC 
GGCAGTTATCTGGAAGATCAGGATATGTGGCGGATGAGCGGCATTTTCCGTGACGTCTCG 
TTGCTGCATAAACCGACTACACAAATCAGCGATTTCCATGTTGCCACTCGCTTTAATGAT 
GATTTCAGCCGCGCTGTACTGGAGGCTGAAGTTCAGATGTGCGGCGAGTTGCGTGACTAC 
CTACGGGTAACAGTTTCTTTATGGCAGGGTGAAACGCAGGTCGCCAGCGGCACCGCGCCT 
TTCGGCGGTGAAATTATCGATGAGCGTGGTGGTTATGCCGATCGCGTCACACTACGTCTG 
AACGTCGAAAACCCGAAACTGTGGAGCGGCGAAATCCCGAATCTCTATCGTGCGGTGGTT 
GAACTGCACACCGCCGACGGCACGCTGATTGAAGCAGAAGCCTGCGATGTCGGTTTCCGC 
GAGGTGCGGATTGAAAATGGTCTGCTGCTGCTGAACGGCAAGCCGTTGCTGATTCGAGGC 
GTT AAC C GTC ACGAGC ATC AT CC T C TG C ATGGT CAGGTC ATGGATGAG CAGACG ATGGTG 
C AGGATATC C TG CTG ATGAAG CAGAAC AAC TTTAACGC CGTGCG CTGTTCGC AT T ATC CG 
AACCATCCGCTGTGGTACACGCTGTGCGACCGCTACGGCCTGTATGTGGTGGATGAAGCC 
AAT ATTG AAAC C CACGG C ATGGTGC CAAT GAATCGTCT GAC CGATGATC CGCGC TGGCTA 
CCGGCGATGAGCGAACGCGTAACGCGAATGGTGCAGCGCGATCGTAATCACCCGAGTGTG 
ATGATCTGGTCGCTGGGGAATGAATCAGGCCACGGCGCTAATCACGACGCGCTGTATCGC 
TGGATCAAATCTGTCGATCCTTCCCGCCCGGTGCAGTATGAAGGCGGCGGAGCCGACACC 
ACGGC CACCGATATTATTTGCC CG ATGTACGCGCG CGTGGATGAAGAC CAGCC C TTC CCG 
GCTGTGCCGAAATGGTCCATGAAAAAATGGCTTTCGCTACCTGGAGAGACGCGCCCGCTG 
ATC CTTTGCGAATACG C C CACGCGATGGGTAACAGTCTTGGCGGTTTCGC TAAATAC TGG 
CAGGCGTTTCGTCAGTATCCCCGTTTACAGGGCGGCTTCGTCTGGGACTGGGTGGATCAG 
TCGCTGATTAAATATGATGAAAACGGCAACCCGTGGTCGGCTTACGGCGGTGATTTTGGC 
GATACGCCGAACGATCGCCAGTTCTGTATGAACGGTCTGGTCTTTGCCGACCGCACGCCG 
CATCCAGCGCTGACGGAAGCAAAACACCAGCAGCAGTTTTTCCAGTTCCGTTTATCCGGG 
C AAAC C AT CGAAGTGAC CAGCGAATACCTGTT C CGTCATAGCGATAACGAGCT C CTGCAC 
TGGATGGTGGCGCTGGATGGTAAGCCGCTGGCAAGCGGTGAAGTGCCTCTGGATGTCGCT 
CCACAAGGTAAACAGTTGATTGAACTGCCTGAACTACCGCAGCCGGAGAGCGCCGGGCAA 
CTCTGGCTCACAGTACGCGTAGTGCAACCGAACGCGACCGCATGGTCAGAAGCCGGGCAC 
ATCAGCGCCTGGCAGCAGTGGCGTCTGGCGGAAAACCTCAGTGTGACGCTCCCCGCCGCG 
TC C CACGCCAT CC CG C ATCTGAC CACCAGCGAAATGGATTTTTGCATCGAGCTGGGTAAT 
AAGCGTTGGCAATTTAACCGCCAGTCAGGCTTTCTTTCACAGATGTGGATTGGCGATAAA 
AAAC AACTG CTGAC GC CGCTG CG CGATCAGTTCAC C CGTGCAC CGCTGGATAACGACAT T 
GGCGTAAGTGAAGCGACCCGCATTGACCCTAACGCCTGGGTCGAACGCTGGAAGGCGGCG 
GGCC ATTAC C AGGC CGAAGCAGCGTTGTTG CAGTG C ACGGCAGATACACTTGCTGATGC G 
GTGCTGATTACGACCGCTCACGCGTGGCAGCATCAGGGGAAAACCTTATTTATCAGCCGG 
AAAAC CT AC CGG ATTGATGGTAGTGGTC AAATGGCGATTAC CGTTGATGTTGAAGTGGCG 
AGCGATACACCGCATCCGGCGCGGATTGGCCTGAACTGCCAGCTGGCGCAGGTAGCAGAG 
CGGGTAAACTGGCTCGGATTAGGGCCGCAAGAAAACTATCCCGACCGCCTTACTGCCGCC 
TGTTTTGAC CGCTGGGATCTGC C ATTGT CAGACATGTATAC CC CGTACGTCTTC CCGAG C 
GAAAACGGTCTGCGCTGCGGGACGCGCGAATTGAATTATGGCCCACACCAGTGGCGCGGC 
GACTTCCAGTTCAACATCAGCCGCTACAGTCAACAGCAACTGATGGAAACCAGCCATCGC 
CATCTGCTGCACGCGGAAGAAGGCACATGGCTGAATATCGACGGTTTCCATATGGGGATT 
GGTGGCGACGACTCCTGGAGCCCGTCAGTATCGGCGGAATTCCAGCTGAGCGCCGGTCGC 
TAC C ATTAC CAGTTGGT CTGGTGTC AAAAAT AAT AATAACCGGGC AGGGGGGATC CG CAG 
ATCCGGCTGTGGAATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGC 
AG AAGTATG CAAAGCATGC C TG CAGGAATT C GATATCAAGCTT ATCGATAC CGTCGAC CT 
CGAGGGGGGGCCCGGTACCCAGCTTTTGTTCCCTTTAGTGAGGGTTAATTGCGCGGGAAG 
TATTTATCACTAATCAAGCACAAGTAATACATGAGAAACTTTTACTACAGCAAGCACAAT 
CCTCCAAAAAATTTTGTTTTTACAAAATCCCTGGTGAACATGATTGGAAGGGACCTACTA 
GGGTGCTGTGGAAGGGTGATGGTGCAGTAGTAGTTAATGATGAAGGAAAGGGAATAATTG 
CTGT AC CATT AAC C AGGACTAAGTTACT AATAAAACC AAATTGAGTAT TGTTGCAGGAAG 
CAAGACCCAACTACCATTGTCAGCTGTGTTTCCTGAGGTCTCTAGGAATTGATTACCTCG 
ATGCTTCATTAAGGAAGAAGAATAAACAAAGACTGAAGGCAATCCAACAAGGAAGACAAC 
CTCAATATTTGTTATAAGGTTTGATATATGGGAGTATTTGGTAAAGGGGTAACATGGTCA 
GCATCGCATTCTATGGGGGAATCCCAGGGGGAATCTCAACCCCTATTACCCAACAGTCAG 
AAAAATCTAAGTGTGAGGAGAACACAATGTTTCAACCTTATTGTTATAATAATGACAGTA 
AGAACAGC ATGGCAGAAT CGAAGGAAGCAAGAGAC CAAG AAATGAAC CTGAAAGAAGAAT 
CTAAAGAAGAAAAAAGAAGAAATGACTGGTGGAAAATAGGTATGTTTCTGTTATGCTTAG 
CAGGAACTACTGGAGGAATACTTTGGTGGTATGAAGGACTCCCACAGCAACATTATATAG 
GGTTGGTGGCGATAGGGGG AAGAT TAAACGGATCTGGC CAATCAAATGCT ATAGAATG C T 
GGGGTTCCTTCCCGGGGTGTAGACCATTTCAAAATTACTTCAGTTATGAGACCAATAGAA 
GCATGCATATGGATAATAATACTGCTACATTATTAGAAGCTTTAACCAATATAACTGCTC
TATAAATAACAAAACAGAATTAGAAACATGGAAGTTAGTAAAGACTTCTGGCATAACTCC 
TTT ACC TAT TT CTTC TGAAG CTAAC ACTGGACT AATTAG ACAT AAGAGAGATTTTGGT AT 
AAGTGCAATAGTGGCAGCTATTGTAGCCGCTACTGCTATTGCTGCTAGCGCTACTATGTC 
TTATGTTG C TCTAAC TGAGGTT AACAAAATAATGGAAGTACAAAAT CATAC TTTTGAGGT 
AGAAAATAGTACTCTAAATGGTATGGATTTAATAGAACGACAAATAAAGATATTATATGC 
TATGATTCTTCAAACACATGCAGATGTTCAACTGTTAAAGGAAAGACAACAGGTAGAGGA 
GACATTTAATTTAATTGGATGTATAGAAAGAACACATGTATTTTGTCATACTGGTCATCC 
CTGG AATAT GTCATG GGGACATTTAAATGAGT CAACAC AATGGGATGACTGGGT AAG C AA 
AATGGAAGATTTAAATCAAGAGATACTAACTACACTTCATGGAGCCAGGAACAATTTGGC 
ACAATCCATGATAACATTCAATACACCAGATAGTATAGCTCAATTTGGAAAAGACCTTTG 
GAGTC ATATTGGAAATTGGATTCCTGG ATTGGGAGC TT C CATTATAAAATAT ATAGTGAT 
GTTTTTGCTTATTTATTTGTTACTAACCTCTTCGCCTAAGATCCTCAGGGCCCTCTGGAA 
GGTGACCAGTGGTGCAGGGTCCTCCGGCAGTCGTTACCTGAAGAAAAAATTCCATCACAA 
AC ATGC AT CGCGAGAAGACACCTGGGAC C AGGCC CAACACAAC ATACAC CT AGCAGGCGT 
G ACCGGTGGAT C AGGGG ACAAAT ACTAC AAGCAGAAGTACT CC AG G AACGAC T GGAATGG 
AGAATCAGAGGAGTACAACAGGCGGCCAAAGAGCTGGGTGAAGTCAATCGAGGCATTTGG 
AGAGAGCTATATTTCCGAGAAGACCAAAGGGGAGATTTCTCAGCCTGGGGCGGCTATCAA 
CGAGC AC AAGAACGGC T CTGGGG GGAAC AATCCTC ACCAAGGGT CCTTAGAC C TGGAGAT 
TCGAAG CGAAGGAGGAAACATTTATGAC TGTTGCATTAAAGC C C AAGAAGGAACTCTCGC 
TATC C CTTGCTGTGGATTT C CCTTATGG CTATTTTGGGGACTAGT AATTATAGTAGGACG 
CATAGCAGGCTATGGATTACGTGGACTCGCTGTTATAATAAGGATTTGTATTAGAGGCTT 
AAATTTGATATTTGAAATAATCAGAAAAATGCTTGATTATATTGGAAGAGCTTTAAATCC 
TGGCACATCTCATGTATCAATGCCTCAGTATGTTTAGAAAAACAAGGGGGGAACTGTGGG 
GTTTTTATGAGGGGTTTTATAAATGATTATAAGAGTAAAAAGAAAGTTGCTGATGCTCTC 
AT AAC C TTGTATAACC CAAAGGACT AGC TCATGTTG CTAGGCAACTAAACCG CAAT AACC 
GCATTTGTGACGCGAGTTCCCCATTGGTGACGCGTTAACTTCCTGTTTTTACAGTATATA 
AGTGCTTGTATTCTGACAATTGGGCACTCAGATTCTGCGGTCTGAGTCCCTTCTCTGCTG 
GGC TGAAAAGG C CTTTGTAATAAATATAATTCTCT ACTCAGTC CCTGTC TCTAGTTTGTC 
TGTTCGAGATCCTACAGAGCTCATGCCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTG 
TGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAA 
GCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCT 
TTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGA 
GGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTC 
GTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAA 
T CAGGG G ATAACGCAGGAAAGAACATGTGAGC AAAAGGC C AGCAAAAGG CC AGGAAC CGT 
AAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAA
AATCG ACG C T C AAGT CAGAGGTGGCGAAAC C CG AC AGGACT ATAAAG AT AC CAGGCGTTT 
CCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTG 
TCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTC 
AGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCC 
GACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTA 
T CGCCAC TGGCAG GAG C CAC TGGTAACAGGATT AGCAGAGCGAGG TATGTAGGCGGTGCT 
ACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATC 
TGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAA 
CAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAA 
AAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAA 
AAC TCACGTT AAGGGATTTTGGTCATGAGATTATCAAAAAGG ATCTTC AC C TAGATCC TT 
TTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGAC 
AGTT AC CAATG C TTAATC AGTGAGG CAC CT ATCT CAGCGATCTGT C TATTTCGTTC ATCC 
ATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGC 
CCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATA 
AACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATC 
CAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGC 
AACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCA 
TTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAA 
GCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCA 
CTC ATGGTT ATGGC AGCACT G C ATAAT T C TC TTAC TGTC ATG C C ATCCGTAAG ATGCTTT 
TCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGT 
TG CTC TTG C C CGGC GTCAAT ACG GGATAAT AC CGCG C C AC ATAGCAGAAC TTT AAAAGTG 
CTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGA 
TCCAGTTCGATGTAACCCACTCGTGGACCCAACTGATCTTCAGCATCTTTTACTTTCACC
AGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCG 
ACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAG 
GGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGG 
GTTCCGCGCACATTTCCCCGAAAAGTGCCAC 
SEQ ID No. }o - pONY8.0Z

AGAT C TTGAATAATAAAATGTGTGTTTGT C CGAAAT ACGCGTTTTGAGATTTCTGTCG C C 

GACTAAATTCATGTCGCGCGATAGTGGTGTTTATCGCCGATAGAGATGGCGATATTGGAA 

AAATTGATATTTGAAAATATGG C ATATTGAAAATGT CGC CGATGTGAGTTTC TGTGTAAC 

TGATATCGCCATTTTTCCAAAAGTGATTTTTGGGCATACGCGATATCTGGCGATAGCGCT 

TATATCGTTTACGGGGGATGGCGATAGACGACTTTGGTGACTTGGGCGATTCTGTGTGTC 

GCAAATATCGCAGTTTCGATATAGGTGACAGACGATATGAGGCTATATCGCCGATAGAGG 

CGACATCAAGCTGGCACATGGCCAATGCATATCGATCTATACATTGAATCAATATTGGCC 

ATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATTGCA 

TACGTTGTATCCATATCGTAATATGTACATTTATATTGGCTCATGTCCAACATTACCGCC 

ATGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCA 

TAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACC 

GCCCAACGACC C C CGCC C ATTG ACGT C AATAATGACGTATGTTC C C AT AGTAACGC C AAT 

AGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGT 

ACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGACGGTAAATGGCC 

CGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTGGCAGTACATCTA 

CGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACACCAATGGGCGTGG 

ATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTT 

GTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTGCGATCGCCCGCC 

CCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGT

TTAGTGAACCGGGCACTCAGATTCTGCGGTCTGAGTCCCTTCTCTGCTGGGCTGAAAAGG 

CCTTTGTAATAAATATAATTCTCTACTCAGTCCCTGTCTCTAGTTTGTCTGTTCGAGATC 

CTACAGTTGGCGCCCGAACAGGGACCTGAGAGGGGCGCAGACCCTACCTGTTGAACCTGG 

CTGATCGTAGGATC CCCGGGACAGC AGAGGAGAACTTAC AGAAGTCTTC TGGAGGTGTT C 

CTGGC CAGAACAC AGGAGGAC AGGTAAGATTGGGAGAC C CTTTGACAT TGGAGCAAGGCG 

CTCAAGAAGTTAGAGAAGGTGACGGTACAAGGGTCTCAGAAATTAACTACTGGTAACTGT 

AATTGGGCGCTAAGTCTAGTAGACTTATTTCATGATACCAACTTTGTAAAAGAAAAGGAC 

TGGCAGCTGAGGGATGTGATTCCATTGCTGGAAGATGTAACTCAGACGCTGTCAGGACAA 

GAAAGAGAGGCCTTTGAAAGAACATGGTGGGCAATTTCTGCTGTAAAGATGGGCCTCCAG 

ATTAATAATGTAGTAGATGGAAAGGCATC AT TCCAGCT C CTAAGAGCGAAATATGAAAAG 

AAGACTGCTAATAAAAAGCAGT CTGAGCC CT CTG AAGAATATCTCTAGAACTAGTGGATC 

CCCCGGGCTGCAGGAGTGGGGAGGCACGATGGCCGCTTTGGTCGAGGCGGATCCGGCCAT 

TAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATTGCATA 

CGTTGTATCCATATCATAATATGTACATTTATATTGG CTCATGTCCAACATTACCG C CAT 

GTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATA 

GCC CATAT ATG GAGTTC CG CGT TACATAAC TT ACGGTAAATGGC C CGCCTGGCTGAC CGC 

CCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAG 

GGACTTTC C AT TGACGT CAATG GGTGGAGT ATTT ACGGT AAACTGC CC AC TTG G CAGTAC 

ATCAAGTGTAT CAT ATG C CAAGTACG C C C C CT AT TGACGT C AATGACGGT AAATGGCCCG 

CCTGG CATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACG 

TATTAGT CAT C GCTATTAC CATGGTGATGCGGTTTTGGC AGTAC ATCAATGGG CG TGG AT 

AGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGT 

TTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGC 

AAATGGGCGGTAGGCATGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACC 

GTCAGAT CGC C TGGAGACGC C AT C CACG CTGTTTTGACCTC CATAGAAGAC AC CGGG AC C 

GATCCAGCCTCCGCGGCCCCAAGCTTCAGCTGCTCGAGGATCTGCGGATCCGGGGAATTC 

CCCAGTCTCAGGATCCACCATGGGGGATCCCGTCGTTTTACAACGTCGTGACTGGGAAAA 

CCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAA 

TAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATG 

GCGCTTTGCCTGGTTTCCGGCACCAGAAGCGGTGCCGGAAAGCTGGCTGGAGTGCGATCT 

TCCTGAGGCCGATACTGTCGTCGTCCCCTCAAACTGGCAGATGCACGGTTACGATGCGCC 

CATCT ACAC CAACGTAAC C TAT C C CATTAC GGT C AAT CCGC CGTTTGTTC C CACGGAGAA 

TCCGACGGGTTGTTACTCGCTCACATTTAATGTTGATGAAAGCTGGCTACAGGAAGGCCA 

GACGCGAATTATTTTTGATGGCGTTAACTCGGCGTTTCATCTGTGGTGCAACGGGCGCTG 

GGTCGGTTACGGCCAGGACAGTCGTTTGCCGTCTGAATTTGACCTGAGCGCATTTTTACG 

CGCCGGAGAAAACCGCCTCGCGGTGATGGTGCTGCGTTGGAGTGACGGCAGTTATCTGGA 

AGATCAGGATATGTGGCGGATGAGCGGCATTTTCCGTGACGTCTCGTTGCTGCATAAACC 

GACTACACAAATCAGCGATTTCCATGTTGCCACTCGCTTTAATGATGATTTCAGCCGCGC 

TGTACTGGAGGCTGAAGTTCAGATGTGCGGCGAGTTGCGTGACTACCTACGGGTAACAGT 

TTCTTTATGGCAGGGTGAAACGCAGGTCGCCAGCGGCACCGCGCCTTTCGGCGGTGAAAT 
TATCGATGAGCGTGGTGGTTATGCCGATCGCGTCACACTACGTCTGAACGTCGAAAACCC
GAAACTGTGGAGCGCCGAAATCCCGAATCTCTATCGTGCGGTGGTTGAACTGCACACCGC 
CGACGGCACGCTGATTGAAGCAGAAGCCTGCGATGTCGGTTTCCGCGAGGTGCGGATTGA 
AAATGGTCTGCTGCTGCTGAACGGCAAGCCGTTGCTGATTCGAGGCGTTAACCGTCACGA 
GCATCATCCTCTGCATGGTCAGGTCATGGATGAGCAGACGATGGTGCAGGATATCCTGCT 
GATGAAGCAGAACAACTTTAACGCCGTGCGCTGTTCGCATTATCCGAACCATCCGGTGTG 
GTACACGCTGTGCGACCGCTACGGCCTGTATGTGGTGGATGAAGCCAATATTGAAACCCA 
CGGCATGGTGCCAATGAATCGTCTGACCGATGATCCGCGCTGGCTACCGGCGATGAGCGA 
ACGCGTAACGCGAATGGTGCAGCGCGATCGTAATCACCCGAGTGTGATCATCTGGTCGCT 
GGGGAATGAATCAGGCCACGGCGCTAATCACGACGCGCTGTATCGCTGGATCAAATCTGT 
CGAT C CT TCC CG C C CGGTGC AGTATGAAGG C GGCGGAG C CGAC ACC ACGG C C ACC GAT AT 
TATTTGCCCGATGTACGCGCGCGTGGATGAAGACCAGCCCTTCCCGGCTGTGCCGAAATG 
GTCCATCAAAAAATGGCTTTCGCTACCTGGAGAGACGCGCCCGCTGATCGTTTGCGAATA 
CGCCCACGCGATGGGTAACAGTCTTGGCGGTTTCGCTAAATACTGGCAGGCGTTTCGTCA 
GTATCCCCGTTTACAGGGCGGCTTCGTCTGGGACTGGGTGGATCAGTCGCTGATTAAATA 
TGATGAAAACGGCAACCCGTGGTCGGCTTACGGCGGTGATTTTGGCGATACGCCGAACGA 
TCGCCAGTTCTGTATGAACGGTCTGGTCTTTGCCGACCGCACGCCGCATCCAGCGCTGAC 
GGAAGCAAAACACCAGCAGCAGTTTTTCCAGTTCCGTTTATCCGGGCAAACCATCGAAGT 
GACCAGCGAATACCTGTTCCGTCATAGCGATAACGAGCTCCTGCACTGGATGGTGGCGCT
GGATGGTAAGCCGCTGGCAAGCGGTGAAGTGCCTCTGGATGTCGCTCCACAAGGTAAACA 
GTTGATTGAACTGCCTGAACTACCGCAGCCGGAGAGCGCCGGGCAACTCTGGCTCACAGT
ACGCGTAGTGCAACCGAACGCGACCGCATGGTCAGAAGCCGGGCACATCAGCGCCTGGCA 
GCAGTGGCGTCTGGCGGAAAACCTCAGTGTGACGCTCCCCGCCGCGTCCCACGCCATCCC 
GCATCTGACCACCAGCGAAATGGATTTTTGCATCGAGCTGGGTAATAAGCGTTGGCAATT 
TAACCGCCAGTCAGGCTTTCTTTCACAGATGTGGATTGGCGATAAAAAACAACTGCTGAC 
GCCGCTGCGCGATCAGTTCACCCGTGCACCGCTGGATAACGACATTGGCGTAAGTGAAGC 
GACCCGCATTGACCCTAACGCCTGGGTCGAACGCTGGAAGGCGGCGGGCCATTACCAGGC 
CGAAGCAGCGTTGTTGCAGTGCACGGCAGATACACTTGCTGATGCGGTGCTGATTACGAC 
CGCTCACGCGTGGCAGCATCAGGGGAAAACCTTATTTATCAGCCGGAAAACCTACCGGAT 
TGATGGTAGTGGTCAAATGGCGATTACCGTTGATGTTGAAGTGGCGAGCGATACACCGCA 
TGCGGCGCGGATTGGGCTGAACTGCCAGCTGGCGCAGGTAGCAGAGCGGGTAAACTGGCT 
CGGATTAGGGCCGCAAGAAAACTATCCCGACCGCCTTACTGCCGCCTGTTTTGACCGCTG 
GGATCTGCCATTGTCAGACATGTATACCCCGTACGTCTTCCCGAGCGAAAACGGTCTGCG
CTGCGGGACGCGCGAATTGAATTATGGCCCACACCAGTGGCGCGGCGACTTCCAGTTCAA
CATGAGCCGCTACAGTCAACAGCAACTGATGGAAACCAGCCATCGCCATCTGCTGCACGC 
GGAAGAAGGCACATGGCTGAATATCGACGGTTTCCATATGGGGATTGGTGGCGACGACTC 
CTGGAGCCCGTCAGTATCGGCGGAATTCCAGCTGAGCGCCGGTCGCTACCATTACCAGTT
GGTCTGGTGTCAAAAATAATAATAACCGGGCAGGGGGGATCCGCAGATCCGGCTGTGGAA
TGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAG
CATGCCTGCAGGAATTCGATATCAAGCTTATCGATACCGTCGACCTCGAGGGGGGGCCCG 
GTACCCAGCTTTTGTTCCCTTTAGTGAGGGTTAATTGCGCGGGAAGTATTTATCACTAAT 
CAAGCACAAGTAATACATGAGAAACTTTTACTACAGCAAGCACAATGCTCCAAAAAATTT
TGTTTTTACAAAATCCCTGGTGAACATGATTGGAAGGGACCTACTAGGGTGCTGTGGAAG
GGTGATGGTGCAGTAGTAGTTAATGATGAAGGAAAGGGAATAATTGCTGTACCATTAACC
AGGACTAAGTTACTAATAAAACCAAATTGAGTATTGTTGCAGGAAGCAAGACCCAACTAC
CATTGTCAGCTGTGTTTCCTGACCTCAATATTTGTTATAAGGTTTGATATGAATCCCAGG 
GGGAATCTCAACCCCTATTACCCAACAGTCAGAAAAATCTAAGTGTGAGGAGAACACAAT
GTTTCAACCTTATTGTTATAATAATGACAGTAAGAACAGCATGGCAGAATCGAAGGAAGC
AAGAGACCAAGAATGAACCTGAAAGAAGAATCTAAAGAAGAAAAAAGAAGAAATGACTGG
TGGAAAATAGGTATGTTTCTGTTATGCTTAGCAGGAACTACTGGAGGAATACTTTGGTGG
TATGAAGGACTCCCACAGCAACATTATATAGGGTTGGTGGCGATAGGGGGAAGATTAAAC 
GGATCTGGCCAATCAAATGCTATAGAATGCTGGGGTTCCTTCCCGGGGTGTAGACCATTT 
CAAAATTACTTCAGTTATGAGACCAATAGAAGCATGCATATGGATAATAATACTGCTACA
TTATTAGAAGCTTTAACCAATATAACTGCTCTATAAATAACAAAACAGAATTAGAAACAT
GGAAGTTAGTAAAGACTTCTGGCATAACTCCTTTACCTATTTCTTCTGAAGCTAACACTG
GACTAATTAGACATAAGAGAGATTTTGGTATAAGTGCAATAGTGGCAGCTATTGTAGCCG
CTACTGCTATTGCTGCTAGCGCTACTATGTCTTATGTTGCTCTAACTGAGGTTAACAAAA 
TAATGGAAGTACAAAATCATACTTTTGAGGTAGAAAATAGTACTCTAAATGGTATGGATT 
TAATAGAACGACAAATAAAGATATTATATGCTATGATTCTTCAAACACATGCAGATGTTC
AACTGTTAAAGGAAAGACAACAGGTAGAGGAGACATTTAATTTAATTGGATGTATAGAAA 
GAACACATGTATTTTGTCATACTGGTCATCCCTGGAATATGTCATGGGGACATTTAAATG
AGTCAACACAATGGGATGACTGGGTAAGCAAAATGGAAGATTTAAATCAAGAGATACTAA
CTACACTTCATGGAGCCAGGAACAATTTGGCACAATCCATGATAACATTCAATACACCAG 
ATAGTATAGCTCAATTTGGAAAAGACCTTTGGAGTCATATTGGAAATTGGATTCCTGGAT
TGGGAGCTTCCATTATAAAATATATAGTGATGTTTTTGCTTATTTATTTGTTACTAACCT 
CTTCGCCTAAGATCCTCAGGGCCCTCTGGAAGGTGACCAGTGGTGCAGGGTCCTCCGGCA 
GTCGTTACCTGAAGAAAAAATTCCATCACAAACATGCATCGCGAGAAGACACCTGGGACC 
AGGCCCAACACAACATACACCTAGCAGGCGTGACCGGTGGATCAGGGGACAAATACTACA 
AGCAGAAGTACTCCAGGAACGACTGGAATGGAGAATCAGAGGAGTACAACAGGCGGCCAA 
AGAGCTGGGTGAAGTCAATCGAGGCATTTGGAGAGAGCTATATTTCCGAGAAGACCAAAG 
GGGAGATTTCTCAGCCTGGGGCGGCTATCAACGAGCACAAGAACGGCTCTGGGGGGAACA 
ATCCTCACCAAGGGTCCTTAGACCTGGAGATTCGAAGCGAAGGAGGAAACATTTATGACT
GTTGCATTAAAGCCCAAGAAGGAACTCTCGCTATCCCTTGCTGTGGATTTCCCTTATGGC
TATTTTGGGGACTAGTAATTATAGTAGGACGCATAGCAGGCTATGGATTACGTGGACTCG 
CTGTTATAATAAGGATTTGTATTAGAGGCTTAAATTTGATATTTGAAATAATCAGAAAAA 
TGCTTGATTATATTGGAAGAGCTTTAAATCCTGGCACATCTCATGTATCAATGCCTCAGT
ATGTTTAGAAAAACAAGGGGGGAACTGTGGGGTTTTTATGAGGGGTTTTATAAATGATTA 
TAAGAGTAAAAAGAAAGTTGCTGATGCTCTCATAACCTTGTATAACCCAAAGGACTAGCT
CATGTTGCTAGGCAACTAAACCGCAATAACCGCATTTGTGACGCGAGTTCCCCATTGGTG
ACGCGTTAACTTCCTGTTTTTACAGTATATAAGTGCTTGTATTCTGACAATTGGGCACTC 
AGATTCTGCGGTCTGAGTCCCTTCTCTGCTGGGCTGAAAAGGCCTTTGTAATAAATATAA 
TTCTCTACTCAGTCCCTGTCTCTAGTTTGTCTGTTCGAGATCCTACAGAGCTCATGCCTT 
GGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACA 
CAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACT 
CACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCT 
GCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGC 
TTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCA 
CTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTG
AGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCA 
TAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAA
CCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCC 
TGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGC 
GCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCT
GGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCG 
TCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAG
GATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTA
CGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGG 
AAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTT 
TGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTT
TTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAG
ATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAAT 
CTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACC
TATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGAT
AACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCC
ACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAG 
AAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAG 
AGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGT 
GGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCG 
AGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGT 
TGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTC 
TCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTC 
ATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAA 
TACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCG 
AAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACC 
CAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAG 
GCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTT 
CCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATT 
TGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCC 
ACCTAAATTGTAAGCGTTAATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATCAGC 
TCATTTTTTAACCAATAGGCCGAAATCGGCAAAATCCCTTATAAATCAAAAGAATAGACC
GAGATAGGGTTGAGTGTTGTTCCAGTTTGGAACAAGAGTCCACTATTAAAGAACGTGGAC 
TCCAACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGAACCATCA 
SEQ ID No. II - pONY8.1Z

AGATCTTGAATAATAAAATGTGTGTTTGTCCGAAATACGCGTTTTGAGATTTCTGTCGCC 
GACTAAATTCATGTCGCGCGATAGTGGTGTTTATCGCCGATAGAGATGGCGATATTGGAA 
AAATTGATATTTGAAAATATGGCATATTGAAAATGTCGCCGATGTGAGTTTCTGTGTAAC 
TGATATCGCCATTTTTCCAAAAGTGATTTTTGGGCATACGCGATATCTGGCGATAGCGCT 
TATATCGTTTACGGGGGATGGCGATAGACGACTTTGGTGACTTGGGCGATTCTGTGTGTC 
GCAAATATCGCAGTTTCGATATAGGTGACAGACGATATGAGGCTATATCGCCGATAGAGG 
CGACATCAAGCTGGCACATGGCCAATGCATATCGATCTATACATTGAATCAATATTGGCC
ATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATTGCA
TACGTTGTATCCATATCGTAATATGTACATTTATATTGGCTCATGTCCAACATTACCGCC 
ATGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCA 
TAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACC
GCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAAT
AGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGT 
ACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGACGGTAAATGGCC 
CGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTGGCAGTACATCTA
CGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACACCAATGGGCGTGG
ATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTT 
GTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTGCGATCGCCCGCC 
CCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGT
TTAGTGAACCGGGCACTCAGATTCTGCGGTCTGAGTCCCTTCTCTGCTGGGCTGAAAAGG 
CCTTTGTAATAAATATAATTCTCTACTCAGTCCCTGTCTCTAGTTTGTCTGTTCGAGATC 
CTACAGTTGGCGCCCGAACAGGGACCTGAGAGGGGCGCAGACCCTACCTGTTGAACCTGG
CTGATCGTAGGATCCCCGGGACAGCAGAGGAGAACTTACAGAAGTCTTCTGGAGGTGTTC
CTGGCCAGAACACAGGAGGACAGGTAAGATTGGGAGACCCTTTGACATTGGAGCAAGGCG 
CTCAAGAAGTTAGAGAAGGTGACGGTACAAGGGTCTCAGAAATTAACTACTGGTAACTGT 
AATTGGGCGCTAAGTCTAGTAGACTTATTTCATGATACCAACTTTGTAAAAGAAAAGGAC 
TGGCAGCTGAGGGATGTCATTCCATTGCTGGAAGATGTAACTCAGACGCTGTCAGGAGAA 
GAAAGAGAGGCCTTTGAAAGAACATGGTGGGCAATTTCTGCTGTAAAGATGGGCCTCCAG
ATTAATAATGTAGTAGATGGAAAGGCATCATTCCAGCTCCTAAGAGCGAAATATGAAAAG
AAGACTGCTAATAAAAAGCAGTCTGAGCCCTCTGAAGAATATCTCTAGAACTAGTGGATC
CCCCGGGCTGCAGGAGTGGGGAGGCACGATGGCCGCTTTGGTCGAGGCGGATCCGGCCAT 
TAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATTGCATA
CGTTGTATCCATATCATAATATGTACATTTATATTGGCTCATGTCCAACATTACCGCCAT
GTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATA 
GCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGC
CCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAG 
GGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTAC 
ATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCG
CCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACG 
TATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGAT
AGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGT
TTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGC 
AAATGGGCGGTAGGCATGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACC
GTCAGATCGCCTGGAGACGCCATCCACGCTGTTTTGACCTCCATAGAAGACACCGGGACC
GATCCAGCCTCCGCGGCCCCAAGCTTCAGCTGCTCGAGGATCTGCGGATCCGGGGAATTC 
CCCAGTCTCAGGATCCACCATGGGGGATCCCGTCGTTTTACAACGTCGTGACTGGGAAAA 
CCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAA 
TAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATG 
GCGCTTTGCCTGGTTTCCGGCACCAGAAGCGGTGCCGGAAAGCTGGCTGGAGTGCGATCT 
TCCTGAGGCCGATACTGTCGTCGTCCCCTCAAACTGGCAGATGCACGGTTACGATGCGCC 
CATCTACACCAACGTAACCTATCCCATTACGGTCAATCCGCCGTTTGTTCCCACGGAGAA 
TCCGACGGGTTGTTACTCGCTCACATTTAATGTTGATGAAAGCTGGCTACAGGAAGGCCA 
GACGCGAATTATTTTTGATGGCGTTAACTCGGCGTTTCATCTGTGGTGCAACGGGCGCTG 
GGTCGGTTACGGCCAGGACAGTCGTTTGCCGTCTGAATTTGACCTGAGCGCATTTTTACG 
CGCCGGAGAAAACCGCCTCGCGGTGATGGTGCTGCGTTGGAGTGACGGCAGTTATCTGGA 
AGATCAGGATATGTGGCGGATGAGCGGCATTTTCCGTGACGTCTCGTTGCTGCATAAACC 
GACTACACAAATCAGCGATTTCCATGTTGCCACTCGCTTTAATGATGATTTCAGCCGCGC 
TGTACTGGAGGCTGAAGTTCAGATGTGCGGCGAGTTGCGTGACTACCTACGGGTAACAGT 
TTCTTTATGGCAGGGTGAAACGCAGGTCGCCAGCGGCACCGCGCCTTTCGGCGGTGAAAT 



TATCGATGAGCGTGGTGGTTATGCCGATCGCGTCACACTACGTCTGAACGTCGAAAACCC 

GAAACTGTGGAGCGCCGAAATCCCGAATCTCTATCGTGCGGTGGTTGAACTGCACACCGC 

CGACGGCACGCTGATTGAAGCAGAAGCCTGCGATGTCGGTTTCCGCGAGGTGCGGATTGA 

AAATGGTCTGCTGCTGCTGAACGGCAAGCCGTTGCTGATTCGAGGCGTTAACCGTCACGA 

GCATCATCCTCTGCATGGTCAGGTCATGGATGAGCAGACGATGGTGCAGGATATCCTGCT 

GATGAAGCAGAACAACTTTAACGCCGTGCGCTGTTCGCATTATCCGAACCATCCGCTGTG 

GTACACGCTGTGCGACCGCTACGGCCTGTATGTGGTGGATGAAGCCAATATTGAAACCCA 

CGGCATGGTGCCAATGAATCGTCTGACCGATGATCCGCGCTGGCTACCGGCGATGAGCGA 

ACGCGTAACGCGAATGGTGCAGCGCGATCGTAATCACCCGAGTGTGATCATCTGGTCGCT 

GGGGAATGAATCAGGCCACGGCGCTAATCACGACGCGCTGTATCGCTGGATCAAATCTGT 

CGATCCTTCCCGCCCGGTGCAGTATGAAGGCGGCGGAGCCGACACCACGGCCACCGATAT

TATTTGCCCGATGTACGCGCGCGTGGATGAAGACCAGCCCTTCCCGGCTGTGCCGAAATG 

GTCCATCAAAAAATGGCTTTCGCTACCTGGAGAGACGCGCCCGCTGATCCTTTGCGAATA 

CGCCCACGCGATGGGTAACAGTCTTGGCGGTTTCGCTAAATACTGGCAGGCGTTTCGTCA 

GTATCCCCGTTTACAGGGCGGCTTCGTCTGGGACTGGGTGGATCAGTCGCTGATTAAATA

TGATGAAAACGGCAACCCGTGGTCGGCTTACGGCGGTGATTTTGGCGATACGCCGAACGA 

TCGCCAGTTCTGTATGAACGGTCTGGTCTTTGCCGACCGCACGCCGCATCCAGCGCTGAC 

GGAAGCAAAACACCAGCAGCAGTTTTTCCAGTTCCGTTTATCCGGGCAAACCATCGAAGT

GACCAGCGAATACCTGTTCCGTCATAGCGATAACGAGCTCCTGCACTGGATGGTGGCGCT

GGATGGTAAGCCGCTGGCAAGCGGTGAAGTGCCTCTGGATGTCGCTCCACAAGGTAAACA 

GTTGATTGAACTGCCTGAACTACCGCAGCCGGAGAGCGCCGGGCAACTCTGGCTCACAGT 

ACGCGTAGTGCAACCGAACGCGACCGCATGGTCAGAAGCCGGGCACATCAGCGCCTGGCA 

GCAGTGGCGTCTGGCGGAAAACCTCAGTGTGACGCTCCCCGCCGCGTCCCACGCCATCCC

GCATCTGACCACCAGCGAAATGGATTTTTGCATCGAGCTGGGTAATAAGCGTTGGCAATT 

TAACCGCCAGTCAGGCTTTCTTTCACAGATGTGGATTGGCGATAAAAAACAACTGCTGAC 

GCCGCTGCGCGATCAGTTCACCCGTGCACCGCTGGATAACGACATTGGCGTAAGTGAAGC

GACCCGCATTGACCCTAACGCCTGGGTCGAACGCTGGAAGGCGGCGGGCCATTACCAGGC 

CGAAGCAGCGTTGTTGCAGTGCACGGCAGATACACTTGCTGATGCGGTGCTGATTACGAC 

CGCTCACGCGTGGCAGCATCAGGGGAAAACCTTATTTATCAGCCGGAAAACCTACCGGAT

TGATGGTAGTGGTCAAATGGCGATTACCGTTGATGTTGAAGTGGCGAGCGATACACCGCA 

TCCGGCGCGGATTGGCCTGAACTGCCAGCTGGCGCAGGTAGCAGAGCGGGTAAACTGGCT 

CGGATTAGGGCCGCAAGAAAACTATCCCGACCGCCTTACTGCCGCCTGTTTTGACCGCTG 

GGATCTGCCATTGTCAGACATGTATACCCCGTACGTCTTCCCGAGCGAAAACGGTCTGCG

CTGCGGGACGCGCGAATTGAATTATGGCCCACACCAGTGGCGCGGCGACTTCCAGTTCAA 

CATCAGCCGCTACAGTCAACAGCAACTGATGGAAACCAGCCATCGCCATCTGCTGCACGC

GGAAGAAGGCACATGGCTGAATATCGACGGTTTCCATATGGGGATTGGTGGCGACGACTC

CTGGAGCCCGTCAGTATCGGCGGAATTCCAGCTGAGCGCCGGTCGCTACCATTACCAGTT

GGTCTGGTGTCAAAAATAATAATAACCGGGCAGGGGGGATCCGCAGATCCGGCTGTGGAA 

TGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAG 

CATGCCTGCAGGAATTCGATATCAAGCTTATCGATACCGTCGAATTGGAAGAGCTTTAAA 

TCCTGGCACATCTCATGTATCAATGCCTCAGTATGTTTAGAAAAACAAGGGGGGAACTGT

GGGGTTTTTATGAGGGGTTTTATAAATGATTATAAGAGTAAAAAGAAAGTTGCTGATGCT 

CTCATAACCTTGTATAACCCAAAGGACTAGCTCATGTTGCTAGGCAACTAAACCGCAATA

ACCGCATTTGTGACGCGAGTTCCCCATTGGTGACGCGTTAACTTCCTGTTTTTACAGTAT 

ATAAGTGCTTGTATTCTGACAATTGGGCACTCAGATTCTGCGGTCTGAGTCCCTTCTCTG 

CTGGGCTGAAAAGGCCTTTGTAATAAATATAATTCTCTACTCAGTCCCTGTCTCTAGTTT 

GTCTGTTCGAGATCCTACAGAGCTCATGCCTTGGCGTAATCATGGTCATAGCTGTTTCCT 

GTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGT 

AAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCC 

GCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGG 

AGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCG 

GTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACA 

GAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAAC

CGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCAC 

AAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCG

TTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATAC 

CTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTAT 

CTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAG 

CCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGAC 

TTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGT 

GCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGT 
ATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGC
AAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTAGGCGCAGA 
AAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAAC 
GAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATC 
CTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCT 
GACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCA 
TCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCT
GGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCA 
ATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCC
ATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTG 
CGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCT 
TCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAA
AAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTA 
TCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGC
TTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCG 
AGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAA
GTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTG 
AGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTC
ACCAGCGTTTCTGGGTGAGCAAAAAGAGGAAGGGAAAATGCCGCAAAAAAGGGAATAAGG 
GCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTAT
CAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATA 
GGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTAAATTGTAAGCGTTAATATTTTGT 
TAAAATTCGCGTTAAATTTTTGTTAAATCAGCTCATTTTTTAACCAATAGGCCGAAATCG 
GCAAAATCCCTTATAAATCAAAAGAATAGACCGAGATAGGGTTGAGTGTTGTTCCAGTTT
GGAACAAGAGTCCACTATTAAAGAACGTGGACTCCAACGTCAAAGGGCGAAAAACCGTCT 
ATCAGGGCGATGGCCCACTACGTGAACCATCACCCTAATCAAGTTTTTTGGGGTCGAGGT
GCCGTAAAGCACTAAATCGGAACCCTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGGAA
AGCCAACCTGGCTTATCGAAATTAATACGACTCACTATAGGGAGACCGGC
SEQ ID No. \% - pONY3.1

AGATCTTCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATAT
TGGCTATTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGCTC
ATGTCCAATATGACCGCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAAT 
TACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAA 
TGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGT 
TCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTA 
AACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGT
CAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCC
TACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCA 
GTACACCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCAT 
TGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAA 
CAACTGCGATCGCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTC 
TATATAAGCAGAGCTCGTTTAGTGAACCGTCAGATCACTAGAAGCTTTATTGCGGTAGTT 
TATCACAGTTAAATTGCTAACGCAGTCAGTGCTTCTGACACAACAGTCTCGAACTTAAGC 
TGCAGTGACTCTCTTAAGGTAGCCTTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAG 
TATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAG
AAGACTCTTGCGTTTCTGATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCT 
CTCCACAGGTGTCCACTCCCAGTTCAATTACAGCTCTTAAGGCTAGAGTACTTAATACGA
CTCACTATAGGCTAGCCTCGAGGTCGACGGTATCGCCCGAACAGGGACCTGAGAGGGGCG
CAGACCCTACCTGTTGAACCTGGCTGATCGTAGGATCCCCGGGACAGCAGAGGAGAACTT
ACAGAAGTCTTCTGGAGGTGTTCCTGGCCAGAACACAGGAGGACAGGTAAGATGGGAGAC
CCTTTGACATGGAGCAAGGCGCTCAAGAAGTTAGAGAAGGTGACGGTACAAGGGTCTCAG 
AAATTAACTACTGGTAACTGTAATTGGGCGCTAAGTCTAGTAGACTTATTTCATGATACC 
AACTTTGTAAAAGAAAAGGACTGGCAGCTGAGGGATGTCATTCCATTGCTGGAAGATGTA 
ACTCAGACGCTGTCAGGACAAGAAAGAGAGGCCTTTGAAAGAACATGGTGGGCAATTTCT
GCTGTAAAGATGGGCCTCCAGATTAATAATGTAGTAGATGGAAAGGCATCATTCCAGCTC
CTAAGAGCGAAATATGAAAAGAAGACTGCTAATAAAAAGCAGTCTGAGCCCTCTGAAGAA
TATCCAATCATGATAGATGGGGCTGGAAACAGAAATTTTAGACCTCTAACACCTAGAGGA 
TATACTACTTGGGTGAATACCATACAGACAAATGGTCTATTAAATGAAGCTAGTCAAAAC 
TTATTTGGGATATTATCAGTAGACTGTACTTCTGAAGAAATGAATGCATTTTTGGATGTG
GTACCTGGCCAGGCAGGACAAAAGCAGATATTACTTGATGCAATTGATAAGATAGCAGAT
GATTGGGATAATAGACATCCATTACCGAATGCTCCACTGGTGGCACCACCACAAGGGCCT
ATTCCCATGACAGCAAGGTTTATTAGAGGTTTAGGAGTACCTAGAGAAAGACAGATGGAG
CCTGCTTTTGATCAGTTTAGGCAGACATATAGACAATGGATAATAGAAGCCATGTCAGAA
GGCATCAAAGTGATGATTGGAAAACCTAAAGCTCAAAATATTAGGCAAGGAGCTAAGGAA
CCTTACCCAGAATTTGTAGACAGACTATTATCCCAAATAAAAAGTGAGGGACATCCACAA
GAGATTTCAAAATTCTTGACTGATACACTGACTATTCAGAACGCAAATGAGGAATGTAGA 
AATGCTATGAGACATTTAAGACCAGAGGATACATTAGAAGAGAAAATGTATGCTTGCAGA 
GACATTGGAACTACAAAACAAAAGATGATGTTATTGGCAAAAGCACTTCAGACTGGTCTT
GCGGGCCCATTTAAAGGTGGAGCCTTGAAAGGAGGGCCACTAAAGGCAGCACAAACATGT 
TATAACTGTGGGAAGCCAGGACATTTATCTAGTCAATGTAGAGCACCTAAAGTCTGTTTT
AAATGTAAACAGCCTGGACATTTCTCAAAGCAATGCAGAAGTGTTCCAAAAAACGGGAAG 
CAAGGGGCTCAAGGGAGGCCCCAGAAACAAACTTTCCCGATACAACAGAAGAGTCAGCAC 
AACAAATCTGTTGTACAAGAGACTCCTCAGACTCAAAATCTGTACCCAGATCTGAGCGAA
ATAAAAAAGGAATACAATGTCAAGGAGAAGGATCAAGTAGAGGATCTCAACCTGGACAGT
TTGTGGGAGTAACATATAATCTAGAGAAAAGGCCTACTACAATAGTATTAATTAATGATA 
CTCCCTTAAATGTACTGTTAGACACAGGAGCAGATACTTCAGTGTTGACTACTGCACATT
ATAATAGGTTAAAATATAGAGGGAGAAAATATCAAGGGACGGGAATAATAGGAGTGGGAG 
GAAATGTGGAAACATTTTCTACGCCTGTGACTATAAAGAAAAAGGGTAGACACATTAAGA 
CAAGAATGCTAGTGGCAGATATTCCAGTGACTATTTTGGGACGAGATATTCTTCAGGACT
TAGGTGCAAAATTGGTTTTGGCACAGCTCTCCAAGGAAATAAAATTTAGAAAAATAGAGT
TAAAAGAGGGCACAATGGGGCCAAAAATTCCTCAATGGCCACTCACTAAGGAGAAACTAG
AAGGGGCCAAAGAGATAGTCCAAAGACTATTGTCAGAGGGAAAAATATCAGAAGCTAGTG
ACAATAATCCTTATAATTCACCCATATTTGTAATAAAAAAGAGGTCTGGCAAATGGAGGT
TATTACAAGATCTGAGAGAATTAAACAAAACAGTACAAGTAGGAACGGAAATATCCAGAG
GATTGCCTCACCCGGGAGGATTAATTAAATGTAAACACATGACTGTATTAGATATTGGAG 
ATGCATATTTCACTATACCCTTAGATCCAGAGTTTAGACCATATACAGCTTTCACTATTC
CCTCCATTAATCATCAAGAACCAGATAAAAGATATGTGTGGAAATGTTTACCACAAGGAT
TCGTGTTGAGCCCATATATATATCAGAAAACATTACAGGAAATTTTACAACCTTTTAGGG 
AAAGATATCCTGAAGTACAATTGTATCAATATATGGATGATTTGTTCATGGGAAGTAATG
GTTCTAAAAAACAACACAAAGAGTTAATCATAGAATTAAGGGCGATCTTACTGGAAAAGG 
GTTTTGAGACACCAGATGATAAATTACAAGAAGTGCCACCTTATAGCTGGCTAGGTTATC 
AACTTTGTCCTGAAAATTGGAAAGTACAAAAAATGCAATTAGACATGGTAAAGAATCCAA
CCCTTAATGATGTGCAAAAATTAATGGGGAATATAACATGGATGAGCTCAGGGATCCCAG
GGTTGACAGTAAAACACATTGCAGCTACTACTAAGGGATGTTTAGAGTTGAATCAAAAAG 
TAATTTGGACGGAAGAGGCACAAAAAGAGTTAGAAGAAAATAATGAGAAGATTAAAAATG 
CTCAAGGGTTACAATATTATAATCCAGAAGAAGAAATGTTATGTGAGGTTGAAATTACAA
AAAATTATGAGGCAACTTATGTTATAAAACAATCACAAGGAATCCTATGGGCAGGTAAAA 
AGATTATGAAGGCTAATAAGGGATGGTCAACAGTAAAAAATTTAATGTTATTGTTGCAAC
ATGTGGCAACAGAAAGTATTACTAGAGTAGGAAAATGT£CAACGTTTAAGGTACCATTTA 
CCAAAGAGCAAGTAATGTGGGAAATGCAAAAAGGATGGTATTATTCTTGGCTCCCAGAAA
TAGTATATACACATCAAGTAGTTCATGATGATTGGAGAATGAAATTGGTAGAAGAACCTA 
CATCAGGAATAACAATATACACTGATGGGGGAAAACAAAATGGAGAAGGAATAGCAGCTT
ATGTGACCAGTAATGGGAGAACTAAACAGAAAAGGTTAGGACCTGTCACTCATCAAGTTG
CTGAAAGAATGGCAATACAAATGGCATTAGAGGATACCAGAGATAAACAAGTAAATATAG 
TAACTGATAGTTATTATTGTTGGAAAAATATTACAGAAGGATTAGGTTTAGAAGGACCAC
AAAGTCCTTGGTGGCCTATAATACAAAATATACGAGAAAAAGAGATAGTTTATTTTGCTT 
GGGTACCTGGTCACAAAGGGATATATGGTAATCAATTGGCAGATGAAGCCGCAAAAATAA
AAGAAGAAATCATGCTAGCATACCAAGGCACACAAATTAAAGAGAAAAGAGATGAAGATG 
CAGGGTTTGACTTATGTGTTCCTTATGACATCATGATACCTGTATCTGACACAAAAATCA
TAGCGACAGATGTAAAAATTCAAGTTCCTCCTAATAGCTTTGGATGGGTCACTGGGAAAT
CATCAATGGCAAAACAGGGGTTATTAATTAATGGAGGAATAATTGATGAAGGATATACAG
GAGAAATACAAGTGATATGTACTAATATTGGAAAAAGTAATATTAAATTAATAGAGGGAC 
AAAAATTTGCACAATTAATTATACTACAGCATCACTCAAATTCCAGACAGCCTTGGGATG
AAAATAAAATATCTCAGAGAGGGGATAAAGGATTTGGAAGTACAGGAGTATTCTGGGTAG 
AAAATATTCAGGAAGCACAAGATGAACATGAGAATTGGCATACATCACCAAAGATATTGG 
CAAGAAATTATAAGATACCATTGACTGTAGCAAAACAGATAACTCAAGAATGTCCTCATT 
GCACTAAGCAAGGATCAGGACCTGCAGGTTGTGTCATGAGATCTCCTAATCATTGGCAGG 
CAGATTGCACACATTTGGACAATAAGATAATATTGACTTTTGTAGAGTCAAATTCAGGAT
ACATACATGCTACATTATTGTCAAAAGAAAATGCATTATGTACTTCATTGGCTATTTTAG 
AATGGGCAAGATTGTTTTCACCAAAGTCCTTACACACAGATAACGGCACTAATTTTGTGG
CAGAACCAGTTGTAAATTTGTTGAAGTTCCTAAAGATAGCACATACCACAGGAATACCAT
ATCATCCAGAAAGTCAGGGTATTGTAGAAAGGGCAAATAGGACCTTGAAAGAGAAGATTC
AAAGTCATAGAGACAACACTCAAACACTGGAGGCAGCTTTACAACTTGCTCTCATTACTT 
GTAACAAAGGGAGGGAAAGTATGGGAGGACAGACACCATGGGAAGTATTTATCACTAATC 
AAGCACAAGTAATACATGAGAAACTTTTACTACAGCAAGCACAATCCTCGAAAAAATTTT
GTTTTTACAAAATCCCTGGTGAACATGATTGGAAGGGACCTACTAGGGTGCTGTGGAAGG
GTGATGGTGCAGTAGTAGTTAATGATGAAGGAAAGGGAATAATTGCTGTACCATTAACCA
GGACTAAGTTACTAATAAAACCAAATTGAGTATTGTTGCAGGAAGCAAGACCCAACTACC
ATTGTCAGCTGTGTTTCCTGAGGTCTCTAGGAATTGATTACCTCGATGCTTCATTAAGGA 
AGAAGAATAAACAAAGACTGAAGGCAATCCAACAAGGAAGACAACCTCAATATTTGTTAT 
AAGGTTTGATATATGGGAGTATTTGGTAAAGGGGTAACATGGTCAGCATCGCATTCTATG
GGGGAATCCCAGGGGGAATCTCAACCCCTATTACCCAACAGTCAGAAAAATCTAAGTGTG
AGGAGAACACAATGTTTCAACCTTATTGTTATAATAATGACAGTAAGAACAGCATGGCAG
AATCGAAGGAAGCAAGAGACCAAGAAATGAACCTGAAAGAAGAATCTAAAGAAGAAAAAA 
GAAGAAATGACTGGTGGAAAATAGGTATGTTTCTGTTATGCTTAGCAGGAACTACTGGAG 
GAATACTTTGGTGGTATGAAGGACTCCCACAGCAACATTATATAGGGTTGGTGGCGATAG
GGGGAAGATTAAACGGATCTGGCCAATCAAATGCTATAGAATGCTGGGGTTCCTTCCCGG 
GGTGTAGACCATTTCAAAATTACTTCAGTTATGAGACCAATAGAAGCATGCATATGGATA
ATAATACTGCTACATTATTAGAAGCTTTAACCAATATAACTGCTCTATAAATAACAAAAC
AGAATTAGAAACATGGAAGTTAGTAAAGACTTCTGGCATAACTCCTTTACCTATTTCTTC
TGAAGCTAACACTGGACTAATTAGACATAAGAGAGATTTTGGTATAAGTGCAATAGTGGC 
AGCTATTGTAGCCGCTACTGCTATTGCTGCTAGCGCTACTATGTCTTATGTTGCTCTAAC 
TGAGGTTAACAAAATAATGGAAGTACAAAATCATACTTTTGAGGTAGAAAATAGTACTCT 
AAATGGTATGGATTTAATAGAACGACAAATAAAGATATTATATGCTATGATTCTTCAAAC
ACATGCAGATGTTCAACTGTTAAAGGAAAGACAACAGGTAGAGGAGACATTTAATTTAAT 
TGGATGTATAGAAAGAACACATGTATTTTGTCATACTGGTCATCCCTGGAATATGTCATG
GGGACATTTAAATGAGTCAACACAATGGGATGACTGGGTAAGCAAAATGGAAGATTTAAA
TCAAGAGATACTAACTACACTTCATGGAGCCAGGAACAATTTGGCACAATCCATGATAAC
ATTCAATACACCAGATAGTATAGCTCAATTTGGAAAAGACCTTTGGAGTCATATTGGAAA
TTGGATTCCTGGATTGGGAGCTTCCATTATAAAATATATAGTGATGTTTTTGCTTATTTA
TTTGTTACTAACCTCTTCGCCTAAGATCCTCAGGGCCCTCTGGAAGGTGACCAGTGGTGC 
AGGGTCCTCCGGCAGTCGTTACCTGAAGAAAAAATTCCATCACAAACATGCATCGCGAGA 
AGACACCTGGGACCAGGCCCAACACAACATACACCTAGCAGGCGTGACCGGTGGATCAGG
GGACAAATACTACAAGCAGAAGTACTCCAGGAACGACTGGAATGGAGAATCAGAGGAGTA 
CAACAGGCGGCCAAAGAGCTGGGTGAAGTCAATCGAGGCATTTGGAGAGAGCTATATTTC 
CGAGAAGACCAAAGGGGAGATTTCTCAGCCTGGGGCGGCTATCAACGAGCACAAGAACGG 
CTCTGGGGGGAACAATCCTCACCAAGGGTCCTTAGACCTGGAGATTCGAAGCGAAGGAGG 
AAACATTTATGACTGTTGCATTAAAGCCCAAGAAGGAACTCTCGCTATCCCTTGCTGTGG 
ATTTCCCTTATGGCTATTTTGGGGACTAGTAATTATAGTAGGACGCATAGCAGGCTATGG 
ATTACGTGGACTCGCTGTTATAATAAGGATTTGTATTAGAGGCTTAAATTTGATATTTGA 
AATAATCAGAAAAATGCTTGATTATATTGGAAGAGCTTTAAATCCTGGCACATCTCATGT 
ATCAATGCCTCAGTATGTTTAGAAAAACAAGGGGGGAACTGTGGGGTTTTTATGAGGGGT 
TTTATAAATGATTATAAGAGTAAAAAGAAAGTTGCTGATGCTCTCATAACCTTGTATAAC 
CCAAAGGACTAGCTCATGTTGCTAGGCAACTAAACCGCAATAACCGCATTTGTGACGCGA
GTTCCCCATTGGTGACGCGTGGTACCTCTAGAGTCGACCCGGGCGGCCGCTTCCCTTTAG 
TGAGGGTTAATGCTTCGAGCAGACATGATAAGATACATTGATGAGTTTGGACAAACCACA 
ACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTATTT
GTAACCATTATAAGCTGCAATAAACAAGTTAACAACAACAATTGCATTCATTTTATGTTT
CAGGTTCAGGGGGAGATGTGGGAGGTTTTTTAAAGCAAGTAAAACCTCTACAAATGTGGT
AAAATCCGATAAGGATCGATCCGGGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCC
CTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGACGCGCCCTGTAGCGGCGCATTAAG 
CGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCGCC 
CGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGC 
TCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGAGCTTTACGGCACCTCGACCGCAA 
AAAACTTGATTTGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCG 
CCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAAC 
ACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTA
TTGGTTAAAAAATGAGCTGATTTAACAAATATTTAACGCGAATTTTAACAAAATATTAAC 
GTTTACAATTTCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACC
GCATACGCGGATCTGCGCAGCACCATGGCCTGAAATAACCTCTGAAAGAGGAACTTGGTT
AGGTACCTTCTGAGGCGGAAAGAACCAGCTGTGGAATGTGTGTCAGTTAGGGTGTGGAAA
GTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAAC
CAGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAA
TTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAG 
TTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGC 
CGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTT
TTGCAAAAAGCTTGATTCTTCTGACACAACAGTCTCGAACTTAAGGCTAGAGCCACCATG
ATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGC 
TATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCAGCG
CAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCAG 
GACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTC 
GACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAGGAT 
CTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGATGCAATGCGG 
CGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAAACATCGCATC 
GAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAGAG
CATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGCGCATGCCCGACGGC 
GAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCATGGTGGAAAATGGC 
CGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTATCAGGACATA 
GCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTC 
GTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCTTGAC 
GAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGACCAAGCGACGCCCAACCTGC 
CATCACGATGGCCGCAATAAAATATCTTTATTTTCATTACATCTGTGTGTTGGTTTTTTG 
TGTGAATCGATAGCGATAAGGATCCGCGTATGGTGCACTCTCAGTACAATCTGCTCTGAT 
GCCGCATAGTTAAGCCAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCT 
TGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGT 
CAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCCTCGTGATACGCCTA 
TTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGG
GGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCG
CTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGT
ATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTT 
GCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTG
GGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAA 
CGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATT 
GACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAG 
TACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGT 
GCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGA
CCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGT
TGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTA
GCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGG 
CAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCC 
CTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGTCTCGCGGT 
ATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACG
GGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTG
ATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAA
CTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAA
ATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTGAGACCCCGTAGAAAAGATCAAAGGA
TCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCG 
CTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACT 
GGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCCAC
CACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTG 
GCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCG 
GATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGA 
ACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCC
GAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACG 
AGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTC 
TGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCC
AGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGGCTCGA 
C 
TGAATAATAAAATGTGTGTTTGTCCGAAATACGCGTTTTGAGATTTCTGTCGCCGACTAA
ATTCATGTCGCGCGATAGTGGTGTTTATCGCCGATAGAGATGGCGATATTGGAAAAATTG 
ATATTTGAAAATATGGCATATTGAAAATGTCGCCGATGTGAGTTTCTGTGTAACTGATAT 
CGCCATTTTTCCAAAAGTGATTTTTGGGCATACGCGATATCTGGCGATAGCGCTTATATC 
GTTTACGGGGGATGGCGATAGACGACTTTGGTGACTTGGGCGATTCTGTGTGTCGCAAAT 
ATCGCAGTTTCGATATAGGTGACAGACGATATGAGGCTATATCGCCGATAGAGGCGACAT 
CAAGCTGGCACATGGCCAATGCATATCGATCTATACATTGAATCAATATTGGCCATTAGC
CATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATTGCATACGTT 
GTATCCATATCGTAATATGTACATTTATATTGGCTCATGTCCAACATTACCGCCATGTTG
ACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCC
ATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAA 
CGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGAC 
TTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCA 
AGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTG 
GCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTGGCAGTACATCTACGTATT
AGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACACCAATGGGCGTGGATAGCG 
GTTTGACTCACGGGGATTTCCAAGTCTCCACCCGATTGACGTCAATGGGAGTTTGTTTTG
GCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTGCGATCGCCCGCCCCGTTG 
ACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAGTG 
AACCGTCAGATCACTAGAAGCTTTATTGCGGTAGTTTATCACAGTTAAATTGCTAACGCA
GTCAGTGCTTCTGACACAACAGTCTCGAACTTAAGCTGCAGTGACTCTCTTAAGGTAGCC 
TTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGTATCAAGGTTACAAGACAGGTTTA 
AGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACTCTTGCGTTTCTGATAGGC 
ACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCACAGGTGTCCACTCCCAGTT 
CAATTACAGCTCTTAAGGCTAGAGTACTTAATACGACTCACTATAGGCTAGTAACGGCCG 
CCAGTGTGCTGGAATTCGGCTTATGGCAGAATCGAAGGAAGCAAGAGACCAAGAAATGAA 
CCTGAAAGAAGAATCTAAAGAAGAAAAAAGAAGAAATGACTGGTGGAAAATAGATCCTCA
GGGCCCTCTGGAAGGTGACCAGTGGTGCAGGGTCCTCCGGCAGTCGTTACCTGAAGAAAA 
AATTCCATCACAAACATGCATCGCGAGAAGACACCTGGGACCAGGCCCAACACAACATAC 
ACCTAGCAGGCGTGACCGGTGGATCAGGGGACAAATACTACAAGCAGAAGTACTCCAGGA 
ACGACTGGAATGGAGAATCAGAGGAGTACAACAGGCGGCCAAAGAGCTGGGTGAAGTCAA 
TCGAGGCATTTGGAGAGAGCTATATTTCCGAGAAGACCAAAGGGGAGATTTCTCAGCCTG
GGGCGGCTATCAACGAGCACAAGAACGGCTCTGGGGGGAACAATCCTCACCAAGGGTCCT 
TAGACCTGGAGATTCGAAGCGAAGGAGGAAACATTTATGAAGCCGAATTCTGCAGATATC
CATCACACTGGCGGCCGCTTCCCTTTAGTGAGGGTTAATGCTTCGAGCAGACATGATAAG
ATACATTGATGAGTTTGGACAAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTG 
TGAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAA
CAACAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTTTA 
AAGCAAGTAAAACCTCTACAAATGTGGTAAAATCCGATAAGGATCGATCCGGGCTGGCGT
AATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAA
TGGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGA 
CCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCG 
CCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGAT 
TTAGAGCTTTACGGCACCTCGACCGCAAAAAACTTGATTTGGGTGATGGTTCACGTAGTG 
GGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATA 
GTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATT 
TATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAATAT 
TTAACGCGAATTTTAACAAAATATTAACGTTTACAATTTCGCCTGATGCGGTATTTTCTC 
CTTACGCATCTGTGCGGTATTTCACACCGCATACGCGGATCTGCGCAGCACCATGGCCTG 
AAATAACCTCTGAAAGAGGAACTTGGTTAGGTACCTTCTGAGGCGGAAAGAACCAGCTGT 
GGAATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGC
AAAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAGTCCCCAGGCTCCCCAGCAG
GCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTC
CGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAA 
TTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGT 
GAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTTGATTCTTCTGACACAACAG 
TCTCGAACTTAAGGCTAGAGCCACCATGATTGAACAAGATGGATTGCACGCAGGTTCTCC 
GGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTC 
TGATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGA 
CCTGTCCGGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCAC

GACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCT 

GCTATTGGGCGAAGTGCCGGGGCAGGATCTCGTGTCATCTCACCTTGCTCCTGCCGAGAA 

AGTATCCATCATGGCTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCC 

ATTCGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCT 

TGTCGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGC 

CAGGCTCAAGGCGCGCATGCCCGAGGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTG 

CTTGCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCT 

GGGTGTGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCT 

TGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCA 

GCGCATCGCCTTCTATCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAA 

ATGACCGACCAAGCGACGCCCAACCTGCCATCACGATGGCCGCAATAAAATATCTTTATT

TTCATTACATCTGTGTGTTGGTTTTTTGTGTGAATCGATAGCGATAAGGATCCGCGTATG 

GTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCC

AACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGC 

TGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGC

GAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGT

TTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATT 

TTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCA 

ATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTT

TTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGA 

TGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAA 

GATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCT 

GCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCAT 

ACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGA

TGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGC 

CAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACAT

GGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAA 

CGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAAC 

TGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAA 

AGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATC

TGGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCC 

CTCCCGTATCGTAGTTATCTAGACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAG

ACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTA

CTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAA 

GATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGC

GTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAAT 

CTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGA 

GCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGT

CCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATA 

CCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTAC 

CGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGG 

TTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCG 

TGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAG

CGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCT 

TTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTC 

AGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTT 

TTGCTGGCCTTTTGCTCACATGGCTCGACAGATCT 
SEQ ID No. 14 - pESYNREV

TCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTA

TTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGGTCATGTCC 

AATATGACCGCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGG 

GTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCC 

GCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCAT 

AGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGC 

CCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGA

CGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTG 

GCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACAC

CAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGT 

CAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTG 

CGATCGCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATA 

AGCAGAGCTCGTTTAGTGAACCGTCAGATCACTAGAAGCTTTATTGCGGTAGTTTATCAC

AGTTAAATTGCTAACGCAGTCAGTGCTTCTGACACAACAGTCTCGAACTTAAGCTGCAGT 

GACTCTCTTAAGGTAGCCTTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGTATCAA 

GGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACT 

CTTGCGTTTCTGATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCAC

AGGTGTCCACTCCCAGTTCAATTACAGCTCTTAAGGCTAGAGTACTTAATACGACTCACT

ATAGGCTAGCCTCGAGAATTCGCCACCATGGCTGAGAGCAAGGAGGCCAGGGATCAAGAG

ATGAACCTCAAGGAAGAGAGCAAAGAGGAGAAGCGCCGCAACGACTGGTGGAAGATCGAC

CCACAAGGCCCCCTGGAGGGGGACCAGTGGTGCCGCGTGCTGAGACAGTCCCTGCCCGAG 

GAGAAGATTCCTAGCCAGACCTGCATCGCCAGAAGACACCTCGGCCCCGGTCCCACCCAG

CACACACCCTCCAGAAGGGATAGGTGGATTAGGGGCCAGATTTTGCAAGCCGAGGTCCTC

CAAGAAAGGCTGGAATGGAGAATTAGGGGCGTGCAACAAGCCGCTAAAGAGCTGGGAGAG 

GTGAATCGCGGCATCTGGAGGGAGCTCTACTTCCGCGAGGACCAGAGGGGCGATTTCTCC 

GCATGGGGAGGCTACCAGAGGGCACAAGAAAGGCTGTGGGGCGAGCAGAGCAGCCCCCGC

GTCTTGAGGCCCGGAGACTCCAAAAGACGCCGCAAACACCTGTGAAGTCGACCCGGGCGG 

CCGCTTCCCTTTAGTGAGGGTTAATGCTTCGAGCAGACATGATAAGATACATTGATGAGT 

TTGGACAAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGAAATTTGTGATG

CTATTGCTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAACAACAACAATTGCA 

TTCATTTTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTTTAAAGCAAGTAAAACC

TCTACAAATGTGGTAAAATCCGATAAGGATCGATCCGGGCTGGCGTAATAGCGAAGAGGC 

CCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGACGCGCCCTGT 

AGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCC 

AGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGC 

TTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGAGCTTTACGG 

CACCTCGACCGCAAAAAACTTGATTTGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGA 

TAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTC 

CAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTG 

CCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAATATTTAACGCGAATTTT

AACAAAATATTAACGTTTACAATTTCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTG 

CGGTATTTCACACCGCATACGCGGATCTGCGCAGCACCATGGCCTGAAATAACCTCTGAA 

AGAGGAACTTGGTTAGGTACCTTCTGAGGCGGAAAGAACCAGCTGTGGAATGTGTGTCAG 

TTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTC

AATTAGTCAGCAACCAGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAA

AGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCC

GCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTT 
GGAGGCCTAGGCTTTTGCAAAAAGCTTGATTCTTCTGACACAACAGTCTCGAACTTAAGG 
CTAGAGCCACCATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGG 
AGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGT 
TCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCC 
TGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTT 
GCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAG 
TGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCATCATGG 
CTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAG 
CGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATG 
ATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGC 
GCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCA 

TGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACC 
GCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGG
CTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCT 
ATCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGACCAAGC 
GACGCCCAACCTGCCATCACGATGGCCGCAATAAAATATCTTTATTTTCATTACATCTGT

CAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCCAACACCCGCTGACG 
CGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCG 
GGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCC 
TCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCAG
GTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATT
CAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAA
GGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTT 
GCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGT
TGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTT
TTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGG 
TATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGA
ATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAA
GAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGA 
CAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAA
CTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACA 
CCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTA
CTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCAC
TTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGC 
GTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAG 
TTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGA
TAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTT 
AGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATA
ATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAG
AAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAA 
CAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTT 
TTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGC
CGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAA
TCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAA 
GACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGC
CCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAA
GCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAA 
CAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCG 
GGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCC 
TATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTG
CTCACATGGCTCGACAGATCT
Seq ID No: 15 codon optimised HIV gag-pol

ATGGGCGCCCGCGCCAGCGTGCTGTCGGGCGGCGAGCTGGACCGCTGGGAGAAGATCCGC 

CTGCGCCCCGGCGGCAAAAAGAAGTACAAGCTGAAGCACATCGTGTGGGCCAGCCGCGAA 

CTGGAGCGCTTCGCCGTGAACCCCGGGCTCCTGGAGACCAGCGAGGGGTGCCGCCAGATC 

CTCGGCCAACTGCAGCCCAGCCTGCAAACCGGCAGCGAGGAGCTGCGCAGCCTGTACAAC 

ACCGTGGCCACGCTGTACTGCGTCCACCAGCGCATCGAAATCAAGGATACGAAAGAGGCC 

CTGGATAAAATCGAAGAGGAACAGAATAAGAGCAAAAAGAAGGCCCAACAGGCCGCCGCG 

GACACCGGACACAGCAACCAGGTCAGCCAGAACTACCCCATCGTGCAGAACATCCAGGGG 

CAGATGGTGCACCAGGCCATCTCCCCCCGCACGCTGAACGCCTGGGTGAAGGTGGTGGAA 

GAGAAGGCTTTTAGCCCGGAGGTGATACCCATGTTCTCAGCCCTGTCAGAGGGAGGCACC 

CCCCAAGATCTGAACACCATGCTCAACACAGTGGGGGGACACCAGGCCGCCATGCAGATG 

CTGAAGGAGACCATCAATGAGGAGGCTGCCGAATGGGATCGTGTGCATCCGGTGCACGCA 

GGGCCCATCGCACCGGGCCAGATGCGTGAGCCACGGGGCTCAGACATCGCCGGAACGACT 

AGTACCCTTCAGGAACAGATCGGCTGGATGACCAACAACCCACCCATCCCGGTGGGAGAA 

ATCTACAAACGCTGGATCATCCTGGGCCTGAACAAGATCGTGCGCATGTATAGCCCTACC 

AGCATCCTGGACATCCGCCAAGGCCCGAAGGAACCCTTTCGCGACTACGTGGACCGGTTC 

TACAAAACGCTCCGCGCCGAGCAGGCTAGCCAGGAGGTGAAGAACTGGATGACCGAAACC 

CTGCTGGTCCAGAACGCGAACCCGGACTGCAAGACGATCCTGAAGGCCCTGGGCCCAGCG 

GCTACCCTAGAGGAAATGATGACCGCCTGTCAGGGAGTGGGCGGACCCGGCCACAAGGCA 

CGCGTCCTGGCTGAGGCCATGAGCCAGGTGACCAACTCCGCTACCATCATGATGCAGCGC 

GGCAACTTTCGGAACCAACGCAAGATCGTCAAGTGCTtCAACTGTGGCAAAGAAGGGCAC 

ACAGCCCGCAACTGCAGGGCCCCTAGGAAAAAGGGCTGTTGGAAATGTGGAAAGGAAGGA

CACGAAATGAAAGATTGTACTGAGAGACAGGCTAATTTTTTAGGGAAGATCTGGCCTTCC 

CACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGACCAGAGCCAACAGCCCCACCAGAA 

GAGAGCTTCAGGTTTGGGGAAGAGACAACAACTCCCTCTCAGAAGCAGGAGCCGATAGAC 

AAGGAACTGTATCCTTTAGCTTCCCTGAGATCACTCTTTGGCAGCGACCCCTCGTCACAA 

TAAAGATAGGGGGGCAGCTCAAGGAGGCTCTCCTGGACACCGGAGCAGACGACACCGTGC

TGGAGGAGATGTCGTTGCCAGGCCGCTGGAAGCCGAAGATGATCGGGGGAATCGGCGGTT 

TCATCAAGGTGCGCCAGTATGACCAGATCCTCATCGAAATCTGCGGCCACAAGGCTATCG 

GTACCGTGCTGGTGGGCCCCACACCCGTCAACATCATCGGACGCAACCTGTTGACGCAGA 

TCGGTTGCACGCTGAACTTCCCCATTAGCCCTATCGAGACGGTACCGGTGAAGCTGAAGC 

CCGGGATGGACGGCCCGAAGGTCAAGCAATGGCCATTGACAGAGGAGAAGATCAAGGCAC 

TGGTGGAGATTTGCACAGAGATGGAAAAGGAAGGGAAAATCTCCAAGATTGGGCCTGAGA 

ACCCGTACAACACGCCGGTGTTCGCAATCAAGAAGAAGGACTCGACGAAATGGCGCAAGC 

TGGTGGACTTCCGCGAGCTGAACAAGCGCACGCAAGACTTCTGGGAGGTTCAGCTGGGCA 

TCCCGCACCCCGCAGGGCTGAAGAAGAAGAAATCCGTGACCGTACTGGATGTGGGTGATG 

CCTACTTCTCCGTTCCCCTGGACGAAGACTTCAGGAAGTACACTGCCTTCACAATCCCtT 

CGATCAACAACGAGACACCGGGGATTCGATATCAGTACAACGTGCTGCCCCAGGGCTGGA 

AAGGCTCTCCCGCAATCTTCCAGAGTAGCATGACCAAAATCCTGGAGCCTTTCCGCAAAC 

AGAACCCCGACATCGTCATCTATCAGTACATGGATGACTTGTACGTGGGCTCTGATCTAG 

AGATAGGGCAGCACCGCACCAAGATCGAGGAGCTGCGCCAGCACCTGTTGAGGTGGGGAC 

TGACCACACCCGACAAGAAGCACCAGAAGGAGCCTCCCTTCCTCTGGATGGGTTACGAGC 

TGCACCCTGACAAATGGACCGTGCAGCCTATCGTGCTGCCAGAGAAAGACAGCTGGACTG 

TCAACGACATACAGAAGCTGGTGGGGAAGTTGAACTGGGCCAGTCAGATTTACCCAGGGA 

TTAAGGTGAGGCAGCTGTGCAAACTCCTCCGCGGAACCAAGGCACTCACAGAGGTGATCC 

CCCTAACCGAGGAGGCCGAGCTCGAACTGGCAGAAAACCGAGAGATCCTAAAGGAGCCCG 

TGCACGGCGTGTACTATGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGGC 

AAGGCCAGTGGACCTATCAGATTTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGT 

ACGCCCGGATGAGGGGTGCCCACACTAACGACGTCAAGCAGCTGACCGAGGCCGTGCAGA 

AGATCACCACCGAAAGCATCGTGATCTGGGGAAAGACTCCTAAGTTCAAGCTGCCCATCC

AGAAGGAAACCTGGGAAACCTGGTGGACAGAGTATTGGCAGGCCACCTGGATTCCTGAGT 

GGGAGTTCGTCAACACCCCTCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCA 

TAGTGGGCGCCGAAACCTTCTACGTGGATGGGGCCGCTAACAGGGAGACTAAGCTGGGCA 

AAGCCGGATACGTCACTAACCGGGGCAGACAGAAGGTTGTCACCCTCACTGACACCACCA 

ACCAGAAGACTGAGCTGCAGGCCATTTACCTCGCTTTGCAGGACTCGGGCCTGGAGGTGA 

ACATCGTGACAGACTCTCAGTATGCCCTGGGCATCATTCAAGCCCAGCCAGACCAGAGTG 

AGTCCGAGCTGGTCAATCAGATCATCGAGCAGCTGATCAAGAAGGAAAAGGTCTATCTGG 

CCTGGGTACCCGCCCACAAAGGCATTGGCGGCAATGAGCAGGTCGACAAGCTGGTCTCGG 

CTGGCATCAGGAAGGTGCTATTCCTGGATGGCATCGACAAGGCCCAGGACGAGCACGAGA 

AATACCACAGCAACTGGCGGGCCATGGCTAGCGACTTCAACCTGCCCCCTGTGGTGGCCA 

AAGAGATCGTGGCCAGCTGTGACAAGTGTCAGCTCAAGGGCGAAGCCATGCATGGCCAGG 

TGGACTGTAGCCCCGGCATCTGGCAACTCGATTGCACCCATCTGGAGGGCAAGGTTATCC 

TGGTAGCCGTCCATGTGGCCAGTGGCTACATCGAGGCCGAGGTCATTCCCGCCGAAACAG 

GGCAGGAGACAGCCTACTTCCTCCTGAAGCTGGCAGGCCGGTGGCCAGTGAAGACCATCC 

ATACTGACAATGGCAGCAATTTCACCAGTGCTACGGTTAAGGCCGCCTGCTGGTGGGCGG 

GAATCAAGCAGGAGTTCGGGATCCCCTACAATCCCCAGAGTCAGGGCGTCGTCGAGTCTA 

TGAATAAGGAGTTAAAGAAGATTATCGGCCAGGTCAGAGATCAGGCTGAGCATCTCAAGA 

CCGCGGTCCAAATGGCGGTATTCATCCACAATTTCAAGCGGAAGGGGGGGATTGGGGGGT 

ACAGTGCGGGGGAGCGGATCGTGGACATCATCGCGACCGACATCCAGACTAAGGAGCTGC 

AAAAGCAGATTACCAAGATTCAGAATTTCCGGGTCTACTACAGGGACAGCAGAAATCCCC 

TCTGGAAAGGCCCAGCGAAGCTCCTCTGGAAGGGTGAGGGGGCAGTAGTGATCCAGGATA 

ATAGCGACATCAAGGTGGTGCCCAGAAGAAAGGCGAAGATCATTAGGGATTATGGCAAAC 

AGATGGCGGGTGATGATTGCGTGGCGAGCAGACAGGATGAGGATTAG 
Seq ID No: 16 codon optimised EIAV gag-pol

ATGGGCGATCCCCTCACCTGGTCCAAAGCCCTGAAGAAACTGGAAAAAGTCACCGTTCAG 
GGTAGCCAAAAGCTTACCACAGGCAATTGCAACTGGGCATTGTCCCTGGTGGATCTTTTC
CACGACACTAATTTCGTTAAGGAGAAAGATTGGCAACTCAGAGACGTGATCCCCCTCTTG 
GAGGACGTGACCCAAACATTGTCTGGGCAGGAGCGCGAAGCTTTCGAGCGCACCTGGTGG 
GCCATCAGCGCAGTCAAAATGGGGCTGCAAATCAACAACGTGGTTGACGGTAAAGCTAGC 
TTTCAACTGCTCCGCGCTAAGTACGAGAAGAAAACCGCCAACAAGAAACAATCCGAACCT 
AGCGAGGAGTACCCAATTATGATCGACGGCGCCGGCAATAGGAACTTCCGCCCACTGACT 
CCCAGGGGCTATACCACCTGGGTCAACACCATCCAGACAAACGGACTTTTGAACGAAGCC 
TCCCAGAACCTGTTCGGCATCCTGTCTGTGGACTGCACCTCCGAAGAAATGAATGCTTTT 
CTCGACGTGGTGCCAGGACAGGCTGGACAGAAACAGATCCTGCTCGATGCCATTGACAAG 
ATCGCCGACGACTGGGATAATCGCCACCCCCTGCCAAACGCCCCTCTGGTGGCTCCCCCA 
CAGGGGCCTATCCCTATGACCGCTAGGTTCATTAGGGGACTGGGGGTGCCCCGCGAACGC 
CAGATGGAGCCAGCATTTGACCAATTTAGGCAGACCTACAGACAGTGGATCATCGAAGCC 
ATGAGCGAGGGGATTAAAGTCATGATCGGAAAGCCCAAGGCACAGAACATCAGGCAGGGG 
GCCAAGGAACCATACCCTGAGTTTGTCGACAGGCTTCTGTCCCAGATTAAATCCGAAGGC 
CACCCTCAGGAGATCTCCAAGTTCTTGACAGACACACTGACTATCCAAAATGCAAATGAA 
GAGTGCAGAAACGCCATGAGGCACCTCAGACCTGAAGATACCCTGGAGGAGAAAATGTAC 
GCATGTCGCGACATTGGCACTACCAAGCAAAAGATGATGCTGCTCGCCAAGGCTCTGCAA 
ACCGGCCTGGCTGGTCCATTCAAAGGAGGAGCACTGAAGGGAGGTCCATTGAAAGCTGCA 
CAAACATGTTATAATTGTGGGAAGCCAGGACATTTATCTAGTCAATGTAGAGCACCTAAA

GTCTGTTTTAAATGTAAACAGCCTGGACATTTCTCAAAGCAATGCAGAAGTGTTCCAAAA 
AACGGGAAGCAAGGGGCTCAAGGGAGGCCCCAGAAACAAACTTTCCCGATACAACAGAAG 
AGTCAGCACAACAAATCTGTTGTACAAGAGACTCCTCAGACTCAAAATCTGTACCCAGAT 
CTGAGCGAAATAAAAAAGGAATACAATGTCAAGGAGAAGGATCAAGTAGAGGATCTCAAC 
CTGGACAGTTTGTGGGAGTAACATACAATCTCGAGAAGAGGCCCACTACCATCGTCCTGA 
TCAATGACACCCCTCTTAATGTGCTGCTGGACACCGGAGCCGACACCAGCGTTCTCACTA 
CTGCTCACTATAACAGACTGAAATACAGAGGAAGGAAATACCAGGGCACAGGCATCATCG 
GCGTTGGAGGCAACGTCGAAACCTTTTCCACTCCTGTCACCATCAAAAAGAAGGGGAGAC 
ACATTAAAACCAGAATGCTGGTCGCCGACATCCCCGTCACCATCCTTGGCAGAGACATTC 
TCCAGGACCTGGGCGCTAAACTCGTGCTGGCACAACTGTCTAAGGAAATCAAGTTCCGCA 
AGATCGAGCTGAAAGAGGGCACAATGGGTCCAAAAATCCCCCAGTGGCCCCTGACCAAAG 
AGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGCGCCTGCTTTCTGAGGGCAAGATTAGCG 
AGGCCAGCGACAATAACCCTTACAACAGCCCCATCTTTGTGATTAAGAAAAGGAGCGGCA 
AATGGAGACTCCTGCAGGACCTGAGGGAACTCAACAAGACCGTCCAGGTCGGAACTGAGA 
TCTCTCGCGGACTGCCTCACCCCGGCGGCCTGATTAAATGCAAGCACATGACAGTCCTTG 
ACATTGGAGACGCTTATTTTACCATCCCCCTCGATCCTGAATTTCGCCCCTATACTGCTT
TTACCATCCCCAGCATCAATCACCAGGAGCCCGATAAACGCTATGTGTGGAAGTGCCTCC 
CCCAGGGATTTGTGCTTAGCCCCTACATTTACCAGAAGACACTTCAAGAGATCCTCCAAC 
CTTTCCGCGAAAGATACCCAGAGGTTCAACTCTACCAATATATGGACGACCTGTTCATGG 
GGTCCAACGGGTCTAAGAAGCAGCACAAGGAACTCATCATCGAACTGAGGGCAATCCTCC 
TGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGCAAGAAGTTCCTCCATATAGCTGGC 
TGGGCTACCAGCTTTGCCCTGAAAACTGGAAAGTCCAGAAGATGCAGTTGGATATGGTCA 
AGAACCCAACACTGAACGACGTCCAGAAGCTCATGGGCAATATTACCTGGATGAGCTCCG 
GAATCCCTGGGCTTACCGTTAAGCACATTGCCGCAACTACAAAAGGATGCCTGGAGTTGA 
ACCAGAAGGTCATTTGGACAGAGGAAGCTCAGAAGGAACTGGAGGAGAATAATGAAAAGA 
TTAAGAATGCTCAAGGGCTCCAATACTACAATCCCGAAGAAGAAATGTTGTGCGAGGTCG 
AAATCACTAAGAACTACGAAGCCACCTATGTCATCAAACAGTCCCAAGGCATCTTGTGGG 
CCGGAAAGAAAATCATGAAGGCCAACAAAGGCTGGTCCACCGTTAAAAATCTGATGCTCC 
TGCTCCAGCACGTCGCCACCGAGTCTATCACCCGCGTCGGCAAGTGCCCCACCTTCAAAG 
TTCCCTTCACTAAGGAGCAGGTGATGTGGGAGATGCAAAAAGGCTGGTACTACTCTTGGC 
TTCCCGAGATCGTCTACACCCACCAAGTGGTGCACGACGACTGGAGAATGAAGCTTGTCG 
AGGAGCCCACTAGCGGAATTACAATCTATACCGACGGCGGAAAGCAAAACGGAGAGGGAA 
TCGCTGCATACGTCACATCTAACGGCCGCACCAAGCAAAAGAGGCTCGGCCCTGTCACTC 
ACCAGGTGGCTGAGAGGATGGCTATCCAGATGGCCCTTGAGGACACTAGAGACAAGCAGG 
TGAACATTGTGACTGACAGCTACTACTGCTGGAAAAACATCACAGAGGGCCTTGGCCTGG 
AGGGACCCCAGTCTCCCTGGTGGCCTATCATCCAGAATATCCGCGAAAAGGAAATTGTCT 
ATTTCGCCTGGGTGCCTGGACACAAAGGAATTTACGGCAACCAACTCGCCGATGAAGCCG
CCAAAATTAAAGAGGAAATCATGCTTGCCTACCAGGGCACACAGATTAAGGAGAAGAGAG 
ACGAGGACGCTGGCTTTGACCTGTGTGTGCCATACGACATCATGATTCCCGTTAGCGACA 
CAAAGATCATTCCAACCGATGTCAAGATCCAGGTGCCACCCAATTCATTTGGTTGGGTGA 
CCGGAAAGTCCAGCATGGCTAAGCAGGGTCTTCTGATTAACGGGGGAATCATTGATGAAG
GATACACCGGCGAAATCCAGGTGATCTGCACAAATATCGGCAAAAGCAATATTAAGCTTA 
TCGAAGGGCAGAAGTTCGCTCAACTCATCATCCTCCAGCACCACAGCAATTCAAGACAAC 
CTTGGGACGAAAACAAGATTAGCCAGAGAGGTGACAAGGGCTTCGGCAGCACAGGTGTGT 
TCTGGGTGGAGAACATCCAGGAAGCACAGGACGAGCACGAGAATTGGCACACCTCCCCTA 
AGATTTTGGCCCGCAATTACAAGATCCCACTGACTGTGGCTAAGCAGATCACACAGGAAT 
GCCCCCACTGCACCAAACAAGGTTCTGGCCCCGCCGGCTGCGTGATGAGGTCCCCCAATC 
ACTGGCAGGCAGATTGCACCCACCTCGACAACAAAATTATCCTGACCTTCGTGGAGAGCA 
ATTCCGGCTACATCCACGCAACACTCCTCTCCAAGGAAAATGCATTGTGCACCTCCCTCG 
CAATTCTGGAATGGGCCAGGCTGTTCTCTCCAAAATCCCTGCACACCGACAACGGCACCA
ACTTTGTGGCTGAACCTGTGGTGAATCTGCTGAAGTTCCTGAAAATCGCCCACACCACTG 
GCATTCCCTATCACCCTGAAAGCCAGGGCATTGTCGAGAGGGCCAACAGAACTCTGAAAG 
AAAAGATCCAATCTCACAGAGACAATACACAGACATTGGAGGCCGCACTTCAGCTCGCCC
TTATCACCTGCAACAAAGGAAGAGAAAGCATGGGCGGCCAGACCCCCTGGGAGGTCTTCA 
TCACTAACCAGGCCCAGGTCATCCATGAAAAGCTGCTCTTGCAGCAGGCCCAGTCCTCCA 
AAAAGTTCTGCTTTTATAAGATCCCCGGTGAGCACGACTGGAAAGGTCCTACAAGAGTTT 
TGTGGAAAGGAGACGGCGCAGTTGTGGTGAACGATGAGGGCAAGGGGATCATCGCTGTGC 
CCCTGACACGCACCAAGCTTCTCATCAAGCCAAACTGA 
SEQ ID NO: 17
pIRES1hyg ESYNGP



AATTCGCCACCATGGGCGATCCCCTCACCTGGTCCAAAGCCCTGAAGAAACTGGAAAAAG 

TCACCGTTCAGGGTAGCCAAAAGCTTACCACAGGCAATTGCAACTGGGCATTGTCCCTGG 

TGGATCTTTTCCACGACACTAATTTCGTTAAGGAGAAAGATTGGCAACTCAGAGACGTGA 

TCCCCCTCTTGGAGGACGTGACCCAAACATTGTCTGGGCAGGAGCGCGAAGCTTTCGAGC 

GCACCTGGTGGGCCATCAGCGCAGTCAAAATGGGGCTGCAAATCAACAACGTGGTTGACG 

GTAAAGCTAGCTTTCAACTGCTCCGCGCTAAGTACGAGAAGAAAACCGCCAACAAGAAAC 

AATCCGAACCTAGCGAGGAGTACCCAATTATGATCGACGGCGCCGGCAATAGGAACTTCC 

GCCCACTGACTCCCAGGGGCTATACCACCTGGGTCAACACCATCCAGACAAACGGACTTT 

TGAACGAAGCCTCCCAGAACCTGTTCGGCATCCTGTCTGTGGACTGCACCTCCGAAGAAA 

TGAATGCTTTTCTCGACGTGGTGCCAGGACAGGCTGGACAGAAACAGATCCTGCTCGATG 

CCATTGACAAGATCGCCGACGACTGGGATAATCGCCACCCCCTGCCAAACGCCCCTCTGG 

TGGCTCCCCCACAGGGGCCTATCCCTATGACCGCTAGGTTCATTAGGGGACTGGGGGTGC 

CCCGCGAACGCCAGATGGAGCCAGCATTTGACCAATTTAGGCAGACCTACAGACAGTGGA 

TCATCGAAGCCATGAGCGAGGGGATTAAAGTCATGATCGGAAAGCCCAAGGCACAGAACA 

TCAGGCAGGGGGCCAAGGAACCATACCCTGAGTTTGTCGACAGGCTTCTGTCCCAGATTA 

AATCCGAAGGCCACCCTCAGGAGATCTCCAAGTTCTTGACAGACACACTGACTATCCAAA 

ATGCAAATGAAGAGTGCAGAAACGCCATGAGGCACCTCAGACCTGAAGATACCCTGGAGG 

AGAAAATGTACGCATGTCGCGACATTGGCACTACCAAGCAAAAGATGATGCTGCTCGCCA 

AGGCTCTGCAAACCGGCCTGGCTGGTCCATTCAAAGGAGGAGCACTGAAGGGAGGTCCAT 

TGAAAGCTGCACAAACATGTTATAATTGTGGGAAGCCAGGACATTTATCTAGTCAATGTA 

GAGCACCTAAAGTCTGTTTTAAATGTAAACAGCCTGGACATTTCTCAAAGCAATGCAGAA 

GTGTTCCAAAAAACGGGAAGCAAGGGGCTCAAGGGAGGCCCCAGAAACAAACTTTCCCGA 

TACAACAGAAGAGTCAGCACAACAAATCTGTTGTACAAGAGACTCCTCAGACTCAAAATC 

TGTACCCAGATCTGAGCGAAATAAAAAAGGAATACAATGTCAAGGAGAAGGATCAAGTAG

AGGATCTCAACCTGGACAGTTTGTGGGAGTAACATACAATCTCGAGAAGAGGCCCACTAC 

CATCGTCCTGATCAATGACACCCCTCTTAATGTGCTGCTGGACACCGGAGCCGACACCAG 

CGTTCTCACTACTGCTCACTATAACAGACTGAAATACAGAGGAAGGAAATACCAGGGCAC

AGGCATCATCGGCGTTGGAGGCAACGTCGAAACCTTTTCCACTCCTGTCACCATCAAAAA 

GAAGGGGAGACACATTAAAACCAGAATGCTGGTCGCCGACATCCCCGTCACCATCCTTGG 

CAGAGACATTCTCCAGGACCTGGGCGCTAAACTCGTGCTGGCACAACTGTCTAAGGAAAT 

CAAGTTCCGCAAGATCGAGCTGAAAGAGGGCACAATGGGTCCAAAAATCCCCCAGTGGCC 

CCTGACCAAAGAGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGCGCCTGCTTTCTGAGGG 

CAAGATTAGCGAGGCCAGCGACAATAACCCTTACAACAGCCCCATCTTTGTGATTAAGAA 

AAGGAGCGGCAAATGGAGACTCCTGCAGGACCTGAGGGAACTCAACAAGACCGTCCAGGT 

CGGAACTGAGATCTCTCGCGGACTGCCTCACCCCGGCGGCCTGATTAAATGCAAGCACAT 

GACAGTCCTTGACATTGGAGACGCTTATTTTACCATCCCCCTCGATCCTGAATTTCGCCC 

CTATACTGCTTTTACCATCCCCAGCATCAATCACCAGGAGCCCGATAAACGCTATGTGTG 

GAAGTGCCTCCCCCAGGGATTTGTGCTTAGCCCCTACATTTACGAGAAGACACTTCAAGA 

GATCCTCCAACCTTTCCGCGAAAGATACCCAGAGGTTCAACTCTACCAATATATGGACGA 

CCTGTTCATGGGGTCCAACGGGTCTAAGAAGCAGCACAAGGAACTCATCATCGAACTGAG 

GGCAATCCTCCTGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGCAAGAAGTTCCTCC 

ATATAGCTGGCTGGGCTACCAGCTTTGCCCTGAAAACTGGAAAGTCCAGAAGATGCAGTT 

GGATATGGTCAAGAACCCAACACTGAACGACGTCCAGAAGCTCATGGGCAATATTACCTG 

GATGAGCTCCGGAATCCCTGGGCTTACCGTTAAGCACATTGCCGCAACTACAAAAGGATG 

CCTGGAGTTGAACCAGAAGGTCATTTGGACAGAGGAAGCTCAGAAGGAACTGGAGGAGAA 

TAATGAAAAGATTAAGAATGCTCAAGGGCTCCAATACTACAATCCCGAAGAAGAAATGTT 

GTGCGAGGTCGAAATCACTAAGAACTACGAAGCCACCTATGTCATCAAACAGTCCCAAGG 

CATCTTGTGGGCCGGAAAGAAAATCATGAAGGCCAACAAAGGCTGGTCCACCGTTAAAAA 

TCTGATGCTCCTGCTCCAGCACGTCGCCACCGAGTCTATCACCCGCGTCGGCAAGTGCCC 

CACCTTCAAAGTTCCCTTCACTAAGGAGCAGGTGATGTGGGAGATGCAAAAAGGCTGGTA 

CTACTCTTGGCTTCCCGAGATCGTCTACACCCACCAAGTGGTGCACGACGACTGGAGAAT 

GAAGCTTGTCGAGGAGCCCACTAGCGGAATTACAATCTATACCGACGGCGGAAAGCAAAA 

CGGAGAGGGAATCGCTGCATACGTCACATCTAACGGCCGCACCAAGCAAAAGAGGCTCGG 

CCCTGTCACTCACCAGGTGGCTGAGAGGATGGCTATCCAGATGGCCCTTGAGGACACTAG 
AGACAAGCAGGTGAACATTGTGACTGACAGCTACTACTGCTGGAAAAACATCACAGAGGG

CCTTGGCCTGGAGGGACCCCAGTCTCCCTGGTGGCCTATCATCCAGAATATCCGCGAAAA 

GGAAATTGTCTATTTCGCCTGGGTGCCTGGACACAAAGGAATTTACGGCAACCAACTCGC 

CGATGAAGCCGCCAAAATTAAAGAGGAAATCATGCTTGCCTACCAGGGCACACAGATTAA 

GGAGAAGAGAGACGAGGACGCTGGCTTTGACCTGTGTGTGCCATACGACATCATGATTCC 

CGTTAGCGACACAAAGATCATTCCAACCGATGTCAAGATCCAGGTGCCACCCAATTCATT 

TGGTTGGGTGACCGGAAAGTCCAGCATGGCTAAGCAGGGTCTTCTGATTAACGGGGGAAT 

CATTGATGAAGGATACACGGGCGAAATCCAGGTGATCTGCACAAATATCGGCAAAAGCAA 

TATTAAGCTTATCGAAGGGCAGAAGTTCGCTCAACTCATCATCCTCCAGCACCACAGCAA 

TTCAAGACAACCTTGGGACGAAAACAAGATTAGCCAGAGAGGTGACAAGGGCTTCGGCAG 

CACAGGTGTGTTCTGGGTGGAGAACATCCAGGAAGCACAGGACGAGCACGAGAATTGGCA 

CACCTCCCCTAAGATTTTGGCCCGCAATTACAAGATCCCACTGACTGTGGCTAAGCAGAT 

CACACAGGAATGCCCCCACTGCACCAAACAAGGTTCTGGCCCCGCCGGCTGCGTGATGAG 

GTCCCCCAATCACTGGCAGGCAGATTGCACCCACCTCGACAACAAAATTATCCTGACCTT 

CGTGGAGAGCAATTCCGGCTACATCCACGCAACACTCCTCTCCAAGGAAAATGCATTGTG 

CACCTCCCTCGCAATTCTGGAATGGGCCAGGCTGTTCTCTCCAAAATCCCTGCACACCGA 

CAACGGCACCAACTTTGTGGCTGAACCTGTGGTGAATCTGCTGAAGTTCCTGAAAATCGC 

CCACACCACTGGCATTCCCTATCACCCTGAAAGCCAGGGCATTGTCGAGAGGGCCAACAG 

AACTCTGAAAGAAAAGATCCAATCTCACAGAGACAATACACAGACATTGGAGGCCGCACT 

TCAGCTCGCCCTTATCACCTGCAACAAAGGAAGAGAAAGCATGGGCGGCCAGACCCCCTG 

GGAGGTCTTCATCACTAACCAGGCCCAGGTCATCCATGAAAAGCTGCTCTTGCAGCAGGC 

CCAGTCCTCCAAAAAGTTCTGCTTTTATAAGATCCCCGGTGAGCACGACTGGAAAGGTCC

TACAAGAGTTTTGTGGAAAGGAGACGGCGCAGTTGTGGTGAACGATGAGGGCAAGGGGAT 

CATCGCTGTGCCCCTGACACGCACCAAGCTTCTCATCAAGCCAAACTGAACCCGGGGCGG 

CCGCACTAGAGGAATTCGCCCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAGCCG 

CTTGGAATAAGGCCGGTGTGTGTTTGTCTATATGTGATTTTCCACCATATTGCCGTCTTT 

TGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAGGGGTCT 

TTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTTCCTCT 

GGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAACCCCCC 

ACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCAAAGGC 

GGCACAACCCCAGTGCCACGTTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTC 

CTCAAGCGTAGTCAACAAGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGTATGGGAAT 

CTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAAGCTCT 

AGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAAGCTTGCCA 

CAACCCCGTACCAAAGATGGATAGATCCGGAAAGCCTGAACTCACCGCGACGTCTGTCGA 

GAAGTTTCTGATCGAAAAGTTCGACAGCGTCTCCGACCTGATGCAGCTCTCGGAGGGCGA 

AGAATCTCGTGCTTTCAGCTTCGATGTAGGAGGGCGTGGATATGTCCTGCGGGTAAATAG 

CTGCGCCGATGGTTTCTACAAAGATCGTTATGTTTATCGGCACTTTGCATCGGCCGCGCT 

CCCGATTCCGGAAGTGCTTGACATTGGGGAATTCAGCGAGAGCCTGACCTATTGCATCTC 

CCGCCGTGCACAGGGTGTCACGTTGCAAGACCTGCCTGAAACCGAACTGCCCGCTGTTCT 

GCAGCCGGTCGCGGAGGCCATGGATGCGATCGCTGCGGCCGATCTTAGCCAGACGAGCGG 

GTTCGGCCCATTCGGACCGCAAGGAATCGGTCAATACACTACATGGCGTGATTTCATATG 

CGCGATTGCTGATCCCCATGTGTATCACTGGCAAACTGTGATGGACGACACCGTCAGTGC 

GTCCGTCGCGCAGGCTCTCGATGAGCTGATGCTTTGGGCCGAGGACTGCCCCGAAGTCCG 

GCACCTCGTGCACGCGGATTTCGGCTCCAACAATGTCCTGACGGACAATGGCCGCATAAC 

AGCGGTCATTGACTGGAGCGAGGCGATGTTCGGGGATTCCCAATACGAGGTCGCCAACAT 

CTTCTTCTGGAGGCCGTGGTTGGCTTGTATGGAGCAGCAGACGCGCTACTTCGAGCGGAG 

GCATCCGGAGCTTGCAGGATCGCCGCGGCTCCGGGCGTATATGCTCCGCATTGGTCTTGA 

CCAACTCTATCAGAGCTTGGTTGACGGCAATTTCGATGATGCAGCTTGGGCGCAGGGTCG 

ATGCGACGCAATCGTCCGATCCGGAGCCGGGACTGTCGGGCGTACACAAATCGCCCGCAG 

AAGCGCGGCCGTCTGGACCGATGGCTGTGTAGAAGTACTCGCCGATAGTGGAAACCGACG 

CCCCAGCACTCGTCCGAGGGCAAAGGAATAGAGTAGATGCCGACCGAACAAGAGCTGATT 

TCGAGAACGCCTCAGCCAGCAACTCGCGCGAGCCTAGCAAGGCAAATGCGAGAGAACGGC 

CTTACGCTTGGTGGCACAGTTCTCGTCCACAGTTCGCTAAGCTCGCTCGGCTGGGTCGCG 

GGAGGGCCGGTCGCAGTGATTCAGGCCCTTCTGGATTGTGTTGGTCCCCAGGGCACGATT 

GTCATGCCCACGCACTCGGGTGATCTGACTGATCCCGCAGATTGGAGATCGCCGCCCGTG 

CCTGCCGATTGGGTGCAGATCTAGAGCTCGCTGATCAGCCTCGACTGTGCCTCTAGTTGC 

CAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCC 

ACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCT 

ATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGG 

CATGCTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGAACCAGCTGGGGCTCG 
AGTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGTATACC
GTCGACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTG 
TTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGG 
TGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTC 
GGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTT 
GCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCT 
GCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGA 
TAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGC 
CGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACG 
CTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGG 
AAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTT 
TCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGCTCACGCTGTAGGTATCTCAGTTCGGT 
GTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCGCGACCGCTG 
CGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACT 
GGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTT 
CTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCT 
GCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCAC 
CGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATC
TCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACG 
TTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTA
AAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCA 
ATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGC 
CTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGC 
TGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCC 
AGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTAT 
TAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGGAACGTTGT 
TGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTC 
CGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAG 
CTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGT 
TATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGAC 
TGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTG 
CCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCAT 
TGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTC 
GATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTC 
TGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAA 
ATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTG 
TCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCG 
CACATTTCCCCGAAAAGTGCCACCTGACGTCGACGGATCGGGAGATCTCCCGATCCCCTA 
TGGTCGACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGTATCTGCTCCCT 
GCTTGTGTGTTGGAGGTCGCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAAGGCAA 
GGCTTGACCGACAATTGCATGAAGAATCTGCTTAGGGTTAGGCGTTTTGCGCTGCTTCGC 
GATGTACGGGCCAGATATACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCA 
ATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTA 
AATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTAT 
GTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGACTATTTACGG 
TAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGAC 
GTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTT 
CCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGG 
CAGTACATCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCC 
ATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGT
AACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATA 
AGCAGAGCTCTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTATCGAAATTAATACG 
ACTCACTATAGGGAGACCCAAGCTTGGTACCGAGCTCGGATCCACTAGTAACGGCCGCCA 
GTGTGCTGGAATTAATTCGCTGTCTGCGAGGGCCAGCTGTTGGGGTGAGTACTCCCTCTC 
AAAAGCGGGCATGACTTCTGCGCTAAGATTGTCAGTTTCCAAAAACGAGGAGGATTTGAT 
ATTCACCTGGCCCGCGGTGATGCCTTTGAGGGTGGCCGCGTCCATCTGGTCAGAAAAGAC 
AATCTTTTTGTTGTCAAGCTTGAGGTGTGGCAGGCTTGAGATCTGGCCATACACTTGAGT 
GACAATGACATCCACTTTGCCTTTCTCTCCACAGGTGTCCACTCCCAGGTCCAACTGCAG 
GTCGATCGAGCA 
SEQ ID NO: 18

pESDSYNGP

TCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTA 

TTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGCTCATGTCC 

AATATGACCGCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGG 

GTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCC 

GCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCAT 

AGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGC 

CCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGA 

CGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTG 

GCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACAC 

CAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGT 

CAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTG 

CGATCGCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATA 

AGCAGAGCTCGTTTAGTGAACCGTCAGATCACTAGAAGCTTTATTGCGGTAGTTTATCAC 

AGTTAAATTGCTAACGCAGTCAGTGCTTCTGACACAACAGTCTCGAACTTAAGCTGGAGT 

GACTCTCTTAAGGTAGCCTTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGTATCAA 

GGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACT 

CTTGCGTTTCTGATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCAC 

AGGTGTCCACTCCCAGTTCAATTACAGCTCTTAAGGCTAGAGTACTTAATACGACTCACT 

ATAGGCTAGAGAATTCCAGGTAAGATGGGCGATCCCCTCACCTGGTCCAAAGCCCTGAAG 

AAACTGGAAAAAGTCACCGTTCAGGGTAGCCAAAAGCTTACCACAGGCAATTGCAACTGG 

GCATTGTCCCTGGTGGATCTTTTCCACGACACTAATTTCGTTAAGGAGAAAGATTGGCAA 

CTCAGAGACGTGATCCCCCTCTTGGAGGACGTGACCCAAACATTGTCTGGGCAGGAGCGC 

GAAGCTTTCGAGCGCACCTGGTGGGCCATCAGCGCAGTCAAAATGGGGCTGCAAATCAAC 

AACGTGGTTGACGGTAAAGCTAGCTTTCAACTGCTCCGCGCTAAGTACGAGAAGAAAACC 

GCCAACAAGAAACAATCCGAACCTAGCGAGGAGTACCCAATTATGATCGACGGCGCCGGC 

AATAGGAACTTCCGCCCACTGACTCCCAGGGGCTATACCACCTGGGTCAACACCATCCAG 

ACAAACGGACTTTTGAACGAAGCCTCCCAGAACCTGTTCGGCATCCTGTCTGTGGACTGC 

ACCTCCGAAGAAATGAATGCTTTTCTCGACGTGGTGCCAGGACAGGCTGGACAGAAACAG 

ATCCTGCTCGATGCCATTGACAAGATCGCCGACGACTGGGATAATCGCCACCCCCTGCCA 

AACGCCCCTCTGGTGGCTCCCCCACAGGGGCCTATCCCTATGACCGGTAGGTTCATTAGG 

GGACTGGGGGTGCCCCGCGAACGCCAGATGGAGCCAGCATTTGACCAATTTAGGCAGACC 

TACAGACAGTGGATCATCGAAGCCATGAGCGAGGGGATTAAAGTCATGATCGGAAAGCCC 

AAGGCACAGAACATCAGGCAGGGGGCCAAGGAACCATACCCTGAGTTTGTCGACAGGCTT 

CTGTCCCAGATTAAATCCGAAGGCCACCCTCAGGAGATCTCCAAGTTCTTGACAGACACA 

CTGACTATCCAAAATGCAAATGAAGAGTGCAGAAACGCCATGAGGCACCTCAGACCTGAA 

GATACCCTGGAGGAGAAAATGTACGCATGTCGCGACATTGGCACTACCAAGCAAAAGATG 

ATGCTGCTCGCCAAGGCTCTGCAAACCGGCCTGGCTGGTCCATTCAAAGGAGGAGCACTG 

AAGGGAGGTCCATTGAAAGCTGCACAAACATGTTATAATTGTGGGAAGCCAGGACATTTA 

TCTAGTCAATGTAGAGCACCTAAAGTCTGTTTTAAATGTAAACAGCCTGGACATTTCTCA 

AAGCAATGCAGAAGTGTTCCAAAAAACGGGAAGCAAGGGGCTCAAGGGAGGCCCCAGAAA 

CAAACTTTCCCGATACAACAGAAGAGTCAGCACAACAAATCTGTTGTACAAGAGACTCCT 

CAGACTCAAAATCTGTACCCAGATCTGAGCGAAATAAAAAAGGAATACAATGTCAAGGAG 

AAGGATCAAGTAGAGGATCTCAACCTGGACAGTTTGTGGGAGTAACATACAATCTCGAGA 

AGAGGCCCACTACCATCGTCCTGATCAATGACACCCCTCTTAATGTGCTGCTGGACACCG 

GAGCCGACACCAGCGTTCTCACTACTGCTCACTATAACAGACTGAAATACAGAGGAAGGA 

AATACCAGGGCACAGGCATCATCGGCGTTGGAGGCAACGTCGAAACCTTTTCCACTCCTG 

TCACCATCAAAAAGAAGGGGAGACACATTAAAACCAGAATGCTGGTCGCCGACATCCCCG 

TCACCATCCTTGGCAGAGACATTCTCCAGGACCTGGGCGCTAAACTCGTGCTGGCACAAC 

TGTCTAAGGAAATCAAGTTCCGCAAGATCGAGCTGAAAGAGGGCACAATGGGTCCAAAAA 

TCCCCCAGTGGCCCCTGACCAAAGAGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGCGCC 

TGCTTTCTGAGGGCAAGATTAGCGAGGCCAGCGACAATAACCCTTACAACAGCCCCATCT 

TTGTGATTAAGAAAAGGAGCGGCAAATGGAGACTCCTGCAGGACCTGAGGGAACTCAACA 

AGACCGTCCAGGTCGGAACTGAGATCTCTCGCGGACTGCCTCACCCCGGCGGCCTGATTA 

AATGCAAGCACATGACAGTCCTTGACATTGGAGACGCTTATTTTACCATCCCCCTCGATC 
CTGAATTTCGCCCCTATACTGCTTTTACCATCCCCAGCATCAATCACCAGGAGCCCGATA
AACGCTATGTGTGGAAGTGCCTCCCCCAGGGATTTGTGCTTAGCCCCTACATTTACCAGA 
AGACACTTCAAGAGATCCTCCAACCTTTCCGCGAAAGATACCCAGAGGTTCAACTCTACC 
AATATATGGACGACCTGTTCATGGGGTCCAACGGGTCTAAGAAGCAGCACAAGGAACTCA 
TCATCGAACTGAGGGCAATCCTCCTGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGC 
AAGAAGTTCCTCCATATAGCTGGCTGGGCTACCAGCTTTGCCCTGAAAACTGGAAAGTCC 
AGAAGATGCAGTTGGATATGGTCAAGAACCCAACACTGAACGACGTCCAGAAGCTCATGG 
GCAATATTACCTGGATGAGCTCCGGAATCCCTGGGCTTACCGTTAAGCACATTGCCGCAA 
CTACAAAAGGATGCCTGGAGTTGAACCAGAAGGTCATTTGGACAGAGGAAGCTCAGAAGG 
AACTGGAGGAGAATAATGAAAAGATTAAGAATGCTCAAGGGCTCCAATACTACAATCCCG 
AAGAAGAAATGTTGTGCGAGGTCGAAATCACTAAGAACTACGAAGCCACCTATGTCATCA 
AACAGTCCCAAGGCATCTTGTGGGCCGGAAAGAAAATCATGAAGGCCAACAAAGGCTGGT 
CCACCGTTAAAAATCTGATGCTCCTGCTCCAGCACGTCGCCACCGAGTCTATCACCCGCG 
TCGGCAAGTGCCCCACCTTCAAAGTTCCCTTCACTAAGGAGCAGGTGATGTGGGAGATGC 
AAAAAGGCTGGTACTACTCTTGGCTTCCCGAGATCGTCTACACCCACCAAGTGGTGCACG 
ACGACTGGAGAATGAAGCTTGTCGAGGAGCCCACTAGCGGAATTACAATCTATACCGACG 
GCGGAAAGCAAAACGGAGAGGGAATCGCTGCATACGTCACATCTAACGGCCGCACCAAGC 
AAAAGAGGCTCGGCCCTGTCACTCACCAGGTGGCTGAGAGGATGGCTATCCAGATGGCCC 
TTGAGGACACTAGAGACAAGGAGGTGAACATTGTGACTGACAGCTACTACTGCTGGAAAA 
ACATCACAGAGGGCCTTGGCCTGGAGGGACCCCAGTCTCCCTGGTGGCCTATCATCCAGA 
ATATCCGCGAAAAGGAAATTGTCTATTTCGCCTGGGTGCCTGGACACAAAGGAATTTACG 
GCAACCAACTCGCCGATGAAGCCGCCAAAATTAAAGAGGAAATCATGCTTGCCTACCAGG 
GCACACAGATTAAGGAGAAGAGAGACGAGGACGCTGGCTTTGACCTGTGTGTGCCATACG 
ACATCATGATTCCCGTTAGCGACACAAAGATCATTCCAACCGATGTCAAGATCCAGGTGC 
CACCCAATTCATTTGGTTGGGTGACCGGAAAGTCCAGCATGGCTAAGCAGGGTCTTCTGA 
TTAACGGGGGAATCATTGATGAAGGATACACCGGCGAAATCCAGGTGATCTGCACAAATA 
TCGGCAAAAGCAATATTAAGCTTATCGAAGGGCAGAAGTTCGCTCAACTCATCATCCTCC 
AGCACCACAGCAATTCAAGACAACCTTGGGACGAAAACAAGATTAGCCAGAGAGGTGACA 
AGGGCTTCGGCAGCACAGGTGTGTTCTGGGTGGAGAACATCCAGGAAGCACAGGACGAGC 
ACGAGAATTGGCACACCTCCCCTAAGATTTTGGCCCGCAATTACAAGATCCCACTGACTG 
TGGCTAAGCAGATCACACAGGAATGCCCGCACTGCACCAAACAAGGTTCTGGCCCCGCCG 
GCTGCGTGATGAGGTCCCCCAATCACTGGCAGGCAGATTGCACCCACCTCGACAACAAAA 
TTATCCTGACCTTCGTGGAGAGCAATTCCGGCTACATCCACGCAACACTCCTCTCCAAGG 
AAAATGCATTGTGCACCTCCCTCGCAATTCTGGAATGGGCCAGGCTGTTCTCTCCAAAAT 
CCCTGCACACCGACAACGGCACCAACTTTGTGGCTGAACCTGTGGTGAATCTGCTGAAGT 
TCCTGAAAATCGCCCACACCACTGGCATTCCCTATCACCCTGAAAGCCAGGGCATTGTCG 
AGAGGGCCAACAGAACTCTGAAAGAAAAGATCCAATCTCACAGAGACAATACACAGACAT 
TGGAGGCCGCACTTCAGCTCGCCCTTATCACCTGCAACAAAGGAAGAGAAAGCATGGGCG 
GCCAGACCCCCTGGGAGGTCTTCATCACTAACCAGGCCCAGGTCATCCATGAAAAGCTGC 
TCTTGCAGCAGGCCCAGTCCTCCAAAAAGTTCTGCTTTTATAAGATCCCCGGTGAGCACG 
ACTGGAAAGGTCCTACAAGAGTTTTGTGGAAAGGAGACGGCGCAGTTGTGGTGAACGATG 
AGGGCAAGGGGATCATCGCTGTGCCCCTGACACGCACCAAGCTTCTCATCAAGCCAAACT 
GAACCCGGGGCGGCCGCTTCCCTTTAGTGAGGGTTAATGCTTCGAGCAGACATGATAAGA 
TACATTGATGAGTTTGGACAAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGT 
GAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAAC 
AAGAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTTTAA 
AGCAAGTAAAACCTCTACAAATGTGGTAAAATCCGATAAGGATCGATCCGGGCTGGCGTA 
ATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAAT 
GGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGAC 
CGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGC 
CACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATT 
TAGAGCTTTACGGCACCTCGACCGCAAAAAACTTGATTTGGGTGATGGTTCACGTAGTGG 
GCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAG 
TGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTT 
ATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAATATT 
TAACGCGAATTTTAACAAAATATTAACGTTTACAATTTCGCCTGATGCGGTATTTTCTCC 
TTACGCATCTGTGCGGTATTTCACACCGCATACGCGGATCTGCGCAGCACCATGGCCTGA 
AATAACCTCTGAAAGAGGAACTTGGTTAGGTACCTTCTGAGGCGGAAAGAACCAGCTGTG 
GAATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCA 
AAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGG 
CAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCC 
GCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAAT
TTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTG 
AGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTTGATTCTTCTGACACAACAGT 
CTCGAACTTAAGGCTAGAGCCACCATGATTGAACAAGATGGATTGCACGCAGGTTCTCCG 
GCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCT 
GATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGAC 
CTGTCCGGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACG 
ACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTG 
CTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAA 
GTATCCATCATGGCTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCA 
TTCGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTT 
GTCGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCC 
AGGCTCAAGGCGCGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGC 
TTGCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTG 
GGTGTGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTT 
GGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAG 
CGCATCGCCTTCTATCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAA 
TGACCGACCAAGCGACGCCCAACCTGCCATCACGATGGCCGCAATAAAATATCTTTATTT 
TCATTACATCTGTGTGTTGGTTTTTTGTGTGAATCGATAGCGATAAGGATCCGCGTATGG 
TGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCCA 
ACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCT 
GTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCG 
AGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTT 
TCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTT 
TTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAA 
TAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTT 
TTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGAT 
GCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAG 
ATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTG 
CTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATA 
CACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGAT 
GGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCC
AACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATG 
GGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAAC 
GACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACT 
GGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAA 
GTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCT 
GGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCC 
TCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGA 
CAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTAC 
TCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAG 
ATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCG 
TCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATC 
TGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAG 
CTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTC 
CTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATAC 
CTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACC 
GGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGT 
TCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGT 
GAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGC 
GGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTT 
TATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCA 
GGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTT 
TGCTGGCCTTTTGCTCACATGGCTCGACAGATCT 
pONY8.3G FB29 - (SEQ ID NO:19)

AGATCTTGAATAATAAAATGTGTGTTTGTCCGAAATACGCGTTTTGAGATTTCTGTCGCC 

GACTAAATTCATGTCGCGCGATAGTGGTGTTTATCGCCGATAGAGATGGCGATATTGGAA 

AAATTGATATTTGAAAATATGGCATATTGAAAATGTCGCCGATGTGAGTTTCTGTGTAAC 

TGATATCGCCATTTTTCCAAAAGTGATTTTTGGGCATACGCGATATCTGGCGATAGCGCT 

TATATCGTTTACGGGGGATGGCGATAGACGACTTTGGTGACTTGGGCGATTCTGTGTGTC 

GCAAATATCGCAGTTTCGATATAGGTGACAGACGATATGAGGCTATATCGCCGATAGAGG 

CGACATCAAGCTGGCACATGGCCAATGCATATCGATCTATACATTGAATCAATATTGGCC 

ATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATTGCA 

TACGTTGTATCCATATCGTAATATGTACATTTATATTGGCTCATGTCCAACATTACCGCC 

ATGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCA

TAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACC 

GCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAAT 

AGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGT 

ACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGACGGTAAATGGCC 

CGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTGGCAGTACATCTA 

CGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACACCAATGGGCGTGG 

ATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTT

GTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTGCGATCGCCCGCC

CCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGT 

TTAGTGAACCGGGCACTCAGATTCTGCGGTCTGAGTCCCTTCTCTGCTGGGCTGAAAAGG

CCTTTGTAATAAATATAATTCTCTACTCAGTCCCTGTCTCTAGTTTGTCTGTTCGAGATC 

CTACAGTTGGCGCCCGAACAGGGACCTGAGAGGGGCGCAGACCCTACCTGTTGAACCTGG

CTGATCGTAGGATCCCCGGGACAGCAGAGGAGAACTTACAGAAGTCTTCTGGAGGTGTTC 

CTGGCCAGAACACAGGAGGACAGGTAAGATTGGGAGACCCTTTGACATTGGAGCAAGGCG 

CTCAAGAAGTTAGAGAAGGTGACGGTACAAGGGTCTCAGAAATTAAGTACTGGTAACTGT 

AATTGGGCGCTAAGTCTAGTAGACTTATTTCATGATACCAACTTTGTAAAAGAAAAGGAC 

TGGCAGCTGAGGGATGTCATTCCATTGCTGGAAGATGTAACTCAGACGCTGTCAGGACAA 

GAAAGAGAGGCCTTTGAAAGAACATGGTGGGCAATTTCTGCTGTAAAGATGGGCCTCCAG 

ATTAATAATGTAGTAGATGGAAAGGCATCATTCCAGCTCCTAAGAGCGAAATATGAAAAG

AAGACTGCTAATAAAAAGCAGTCTGAGCCCTCTGAAGAATATCTCTAGAACTAGTGGATC 

CCCCGGGCTGCAGGAGTGGGGAGGCACGATGGCCGCTTTGGTCGAGGCGGATCCGGCCAT 

TAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATTGCATA 

CGTTGTATCCATATCATAATATGTACATTTATATTGGCTCATGTCCAACATTACCGCCAT 

GTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATA 

GCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGC

CCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAG 

GGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTAC 

ATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCG 

CCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACG 

TATTAGTCATGGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGAT

AGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGT 

TTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGC 

AAATGGGCGGTAGGCATGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACC 

GTCAGATCGCCTGGAGACGCCATCCACGCTGTTTTGACCTCCATAGAAGACACCGGGACC 
GATCCAGCCTCCGCGGCCCCAAGCTTGTTGGGATCCACCGGTCGCCACCATGGTGAGCAA

GGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAA 

CGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGAC 

CCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCAC 

CCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTT 

CTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGA 

CGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCAT 

CGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTA 

CAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGT 

GAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCA

GCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCAC 

CCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTT 

CGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGTAAAGCGGCCGCGA 

CTCTAGAGTCGACCTGCAGGCATGCAAGCTTCAGCTGCTCGAGGGGGGGCCCGGTACCCA 

GCTTTTGTTCCCTTTAGTGAGGGTTAATTGCGCGGGAAGTATTTATCACTAATC

AAGTAATACATGAGAAACTTTTACTACAGCAAGCACAATCCTCCAA^ 

ACAAAATCCCTGGTGAACATGATTGGAAGGGACCTACTAGGGTGCTGTGGAAGGGTGATG 

GTGCAGTAGTAGTTAATGATGAAGGAAAGGGAATAATTGCTGTACCATTAACCAGGACTA 

AGTTACTAATAAAACCAAATTGAGTATTGTTGCAGGAAGCAAGACCCAACTACCATTGTC 

AGCTGTGTTTCCTGACCTCAATATTTGTTATAAGGTTTGATATGAATCCCAGGGGGAATC 

TCAACCCCTATTACCCAACAGTCAGAAAAATCTAAGTGTGAGGAGAACACAATGTTTCAA 

CCTTATTGTTATAATAATGACAGTAAGAACAGCATGGCAGAATCGAAGGAAGCAAGAGAC 

CAAGAATGAACCTGAAAGAAGAATCTAAAGAAGAAAAAAGAAGAAATGACTGGTGGAAAA 

TAGGTATGTTTCTGTTATGCTTAGCAGGAACTACTGGAGGAATACTTTGGTGGTATGAAG

GACTCCCACAGCAACATTATATAGGGTTGGTGGCGATAGGGGGAAGATTAAACGGATCTG 

GCCAATCAAATGCTATAGAATGCTGGGGTTCCTTCCCGGGGTGTAGACCATTTCAAAATT 

ACTTCAGTTATGAGACCAATAGAAGCATGCATATGGATAATAATACTGCTACATTATTAG 

AAGCTTTAACCAATATAACTGCTCTATAAATAACAAAACAGAATTAGAAACATGGAAGTT 

AGTAAAGACTTCTGGCATAACTCCTTTACCTATTTCTTCTGAAGCTAACACTGGACTAAT 

TAGACATAAGAGAGATTTTGGTATAAGTGCAATAGTGGCAGCTATTGTAGCCGCTACTGC 

TATTGCTGCTAGCGCTACTATGTCTTATGTTGCTCTAACTGAGGTTAACAAAATAATGGA 

AGTACAAAATCATACTTTTGAGGTAGAAAATAGTACTCTAAATGGTATGGATTTAATAGA 

ACGACAAATAAAGATATTATATGCTATGATTCTTCAAACACATGCAGATGTTCAACTGTT

AAAGGAAAGACAACAGGTAGAGGAGACATTTAATTTAATTGGATGTATAGAAAGAACACA 

TGTATTTTGTCATACTGGTCATCCCTGGAATATGTCATGGGGACATTTAAATGAGTCAAC 

ACAATGGGATGACTGGGTAAGCAAAATGGAAGATTTAAATCAAGAGATACTAACTACACT 

TCATGGAGCCAGGAACAATTTGGCACAATCCATGATAACATTCAATACACCAGATAGTAT 

AGCTCAATTTGGAAAAGACCTTTGGAGTCATATTGGAAATTGGATTCCTGGATTGGGAGC

TTCCATTATAAAATATATAGTGATGTTTTTGCTTATTTATTTGTTACTAACCT 

TAAGATCCTCAGGGCCCTCTGGAAGGTGACCAGTGGTGCAGGGTCCTCCGGCAGTCGTTA 

CCTGAAGAAAAAATTCCATCACAAACATGCATCGCGAGAAGACACCTGGGACCAGGCCCA

ACACAACATACACCTAGCAGGCGTGACCGGTGGATCAGGGGACAAATACTACAAGCAGAA 

GTACTCCAGGAACGACTGGAATGGAGAATCAGAGGAGTACAACAGGCGGCCAAAGAGCTG

GGTGAAGTCAATCGAGGCATTTGGAGAGAGCTATATTTCCGAGAAGACCAAAGGGGAGAT 

TTCTCAGCCTGGGGCGGCTATCAACGAGCACAAGAACGGCTCTGGGGGGAACAATCCTCA

CCAAGGGTCCTTAGACCTGGAGATTCGAAGCGAAGGAGGAAACATTTATGACTGTTGCAT

TAAAGCCCAAGAAGGAACTCTCGCTATCCCTTGCTGTGGATTTCCCTTATGGCTATTTTG
GGGACTAGTAATTATAGTAGGACGCATAGCAGGCTATGGATTACGTGGACTCGCTGTTAT

AATAAGGATTTGTATTAGAGGCTTAAATTTGATATTTGAAATAATCAGAAAAATGCTTGA 

TTATATTGGAAGAGCTTTAAATCCTGGCACATCTCATGTATCAATGCCTCAGTATGTTTA 

GAAAAACAAGGGGGGAAGTGTGGGGTTTTTATGAGGGGTTTTATAAATGATTATAAGAGT 

AAAAAGAAAGTTGCTGATGCTCTCATAACCTTGTATAACCCAAAGGACTAGCTCATGTTG 

CTAGGCAACTAAACCGCAATAACCGCATTTGTGACGCGAGTTCCCCATTGGTGACGCGTT 

AACTTCCTGTTTTTACAGTATATAAGTGCTTGTATTCTGACAATTGGGCACTCAGATTCT 

GCGGTCTGAGTCCCTTCTCTGCTGGGCTGAAAAGGCCTTTGTAATAAATATAATTCTCTA 

CTCAGTCCCTGTCTCTAGTTTGTCTGTTCGAGATCCTACAGAGCTCATGCCTTGGCGTAA 

TCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATA 

CGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTA 

ATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGTGATGCCCG 

GGCGGCCGAGGCGGCCTACGTGAACCATCACCCAAATCAAGTTTTTTGCGGTCGAGGTGC 

CGTAAAGCTCTAAATCGGAACCCTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGGAAAG

CCGGCGAACGTGGCGAGAAAGGAAGGGAAGAAAGCGAAAGGAGCGGGCGCTAGGGCGCTG 

GCAAGTGTAGCGGTCACGCTGCGCGTAACCACCACACCCGCCGCGCTTAATGCGCCGCTA

CAGGGCGCGTCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGG 

CCTCTTCGCTATTACGCCAGCCCGGATCGATCCTTATCGGATTTTACCACATTTGTAGAG 

GTTTTACTTGCTTTAAAAAACCTCCCACATCTCCCCCTGAACCTGAAACATAAAATGAAT 

GCAATTGTTGTTGTTAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGC

ATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGG

CTCATCAATGTATCTTATCATGTCTGCTCGAAGCATTAACCCTCACTAAAGGGAAGCGGC

CGCCCGGGTCGACTTCACAGGTGTTTGCGGCGTCTTTTGGAGTCTCCGGGCCTCAAGACG 

CGGGGGCTGCTCTGCTCGCCCCACAGCCTTTCTTGTGCCCTCTGGTAGCCTCCCCATGCG

GAGAAATCGCCCCTCTGGTCCTCGCGGAAGTAGAGCTCCCTCCAGATGCCGCGATTCACC 

TCTCCCAGCTCTTTAGCGGCTTGTTGCACGCCCCTAATTCTCCATTCCAGCCTTTCTTGG 

AGGACCTCGGCTTGCAAAATCTGGCCCCTAATCCACCTATCCCTTCTGGAGGGTGTGTGC 

TCGGGCAGGGACTGTCTCAGCACGCGGCACCACTGGTCCCCCTCCAGGGGGCCTTGTGGG 

TCGATCTTCCACCAGTCGTTGCGGCGCTTCTCCTCTTTGCTCTCTTCCTTGAGGTTCATC 

TCTTGATCCCTGGCCTCCTTGCTCTCAGCCATGGTGGCGAATTCTCGAGGCTAGCCTCCC

GGTGGTGGGTCGGTGGTCCCTGGGCAGGGGTCTCCAGATCCCGGACGAGCCCCCAAATGA 

AAGACCCCCGAGACGGGTAGTCAATCACTCTGAGGAGACCCTCCCAAGGAACAGCGAGAC 

CACGAGTCGGATGCAACAGCAAGAGGATTTATTGGATACACGGGTACCCGGGCGACTCAG 

TCTATCGGAGGACTGGCGCGCCGAGTGAGGGGTTGTGAGCTCTTTTATAGAGCTCGGGAA 

GCAGAAGCGCGCGAACAGAAGCGAGAAGCAGGCTGATTGGTTAATTCAAATAAGGCACAG 

GGTCATTTCAGGTCCTTGGGCKjAGCCTGGAAACATCTGATGGGTCTTAAGAAACTGCTGA 

GGGTTGGGCCATATCTGGGGACCATCTGTTCTTGGCCCCGGGCCGGGGCCGAACCGCGGT 

GACCATCTGTTCTTGGCCCCGGGCCGGGGCCGAAACTGCTCACCGCAGATATCCTGTTTG 

GCCCAACGTTAGCTGTTTTCGTGTACCCGCCCriTGATCTGAACTTCTCTATTC^ 

GGTATTTTTCCATGCCTTGCAAAATGGCGTTACTGCGGCTATCAGGCTAAGCAATTTGAG 

ATCTGGCCGAGGCGGCCTACTCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGT

TTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGG 

CTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGG 

GATAACGCAGGAAAGAACATGTATAACTTCGTATAATGTATGCTATACGAAGTTATACAT 

GTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTT 

CCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCG






AAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTC 

TCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGGCTTTCTCCCTTCGGGAAGCGT

GGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAA

GCTGGGCTGTGTGCACGAACCGCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTA 

TCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAA 

CAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAA 

CTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTT 

CGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTT 

TTTTGTTTGCAAGCAGCAGATTACGGGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGAT

CTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCAT

GAGATTATGAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGT^

AATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGC 

ACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTA 

GATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGA 

CCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCG 

CAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGC 

TAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCAT 

CGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAG 

GCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGAT 

CGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAA 

TTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAA 

GTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGA 

TAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGG

GCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGC

ACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGG

AAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACT 

CTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACAT 

ATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGT 

GCCACCTAAATTGTAAGCGTTAATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATC 

AGCTCATTTTTTAACCAATAGGCCGAAATCGGCAAAATCCCTTATAAATCAAAAGAATAG

ACCGAGATAGGGTTGAGTGTTGTTCCAGTTTGGAACAAGAGTCCACTATTAAAGAACGTG

GACTCCAACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGATAAC

TTCGTATAATGTATGCTATACGAAGTTATCACTACGTGAACCATCACCCTAATCAAGTTT 

TTTGGGGTCGAGGTGCCGTAAAGCACTAAATCGGAACCCTAAAGGGAGCCCCCGATTTAG 

AGCTTGACGGGGAAAGCCAACCTGGCTTATCGAAATTAATACGACTCACTATAGGGAGAC 
CGGC 






(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(19) World Intellectual Property Organization
International Bureau

(43) International Publication Date
25 October 2001 (25.10.2001)

PCT

(10) International Publication Number
WO 01/79518 A3

(51) International Patent Classification 7: C12N 15/86,
5/10, C07K 14/16, 14/155, C12N 7/04, A61K 48/00

(21) International Application Number: PCT/GB01/01784

(22) International Filing Date: 18 April 2001 (18.04.2001)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data:
0009760.0 19 April 2000 (19.04.2000) GB

(71) Applicant (for all designated States except US): OXFORD
BIOMEDICA (UK) LIMITED [GB/GB]; Medawar Centre,
Robert Robinson Avenue, The Oxford Science Park,
Oxford OX4 4GA (GB).

(72) Inventors; and

(75) Inventors/Applicants (for US only): KINGSMAN, Alan,
John [GB/GB]; Keepers House, Oaksmere, Appleton,
Oxfordshire OX13 5PP (GB). KIM, Narry [KR/US]; The
Howard Hughes Medical Institute, University of Pennsylvania
School of Medicine, 326 Clinical Research Building,
415 Curie Boulevard, Philadelphia, PA 19104-6148 (US).
KOTSOPOULOU, Ekaterini [GR/US]; Enders 850,
Children's Hospital, 320 Longwood Avenue, Boston, MA
02115 (US). ROHLL, Jonathan [GB/GB]; 10 Chapel
Close, South Stoke, Reading, Berkshire RG8 0JW (GB).
MITROPHANOUS, Kyriacos, Andreou [GR/GB]; 39
Wytham Road, Oxford OX1 4TR (GB).

(74) Agents: MALLALIEU, Catherine, Louise et al.; D.
Young & Co., 21 New Fetter Lane, London EC4A 1DA
(GB).

(81) Designated States (national): AE, AG, AL, AM, AT, AU,
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU,
CZ, DE, DK, DM, DZ, EE, ES, FI, GB, GD, GE, GH, GM,
HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK,
LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX,
MZ, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL,
TJ, TM, TR, TT, TZ, UA, UG, US, UZ, VN, YU, ZA, ZW.

(84) Designated States (regional): ARIPO patent (GH, GM,
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW), Eurasian

[Continued on next page]



(54) Title: CODON OPTIMISATION FOR EXPRESSION IN RETROVIRUS PACKAGING CELLS 



[Figure: Codon usage in human genes (MH), wild type HIV-1 Gag-pol (WT) and the codon optimised HIV-1 Gag-pol (CO); columns MH, WT and CO per codon, grouped by amino acid.]

(57) Abstract: A method of producing a replication defective retrovirus comprising transfecting a producer cell with the following:
i) a retroviral genome; ii) a nucleotide sequence coding for retroviral gag and pol proteins; and iii) nucleotide sequences encoding
other essential viral packaging components not encoded by the nucleotide sequence of (ii); characterised in that the nucleotide
sequence coding for retroviral gag and pol proteins is codon optimised for expression in the producer cell.
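
The codon usage comparison referred to in the figure (human genes, wild type gag-pol, codon optimised gag-pol) can be illustrated with a short sketch. The Python below is a minimal illustration only and is not taken from the application: the function name `codon_usage`, the use of the standard genetic code, and the toy input sequence are all assumptions made for the example. It tabulates, for each amino acid, the percentage of occurrences encoded by each synonymous codon in an in-frame coding sequence.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# standard code applies to the coding sequence being analysed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(seq):
    """Return {amino_acid: {codon: percent}} for an in-frame coding sequence.

    Percentages are relative to all codons encoding the same amino acid.
    Codons containing characters other than A, C, G or T are skipped.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    by_aa = defaultdict(dict)
    for codon, n in counts.items():
        aa = CODON_TABLE.get(codon)
        if aa is not None:
            by_aa[aa][codon] = n

    return {
        aa: {codon: 100.0 * n / sum(codons.values()) for codon, n in codons.items()}
        for aa, codons in by_aa.items()
    }


if __name__ == "__main__":
    # Hypothetical toy sequence, not one of the sequences in the listing.
    toy = "ATGGCCGCCGCAGCTCTGCTGTTATAA"
    for aa, codons in sorted(codon_usage(toy).items()):
        for codon, pct in sorted(codons.items()):
            print(aa, codon, round(pct))
```

Running the same function over a wild type and a codon optimised coding sequence, and comparing both against a reference table for highly expressed human genes, would give the kind of side-by-side comparison the figure summarises.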